RELIABILITY AND SERVICE LIFE CONCEPTS FOR SONAR TRANSDUCER APPLICATIONS


1.0  INTRODUCTION  AND  SUMMARY 


In  recent  years,  the  Navy  has  begun  to  include  reliability  requirements  in 
procurement  specifications  for  wet-end  sonar  hardware.  For  the  most  part  the 
requested  reliability  evaluations  have  focused  on  exponential  modeling  and  the 
use  of  handbook  methods  originally  developed  primarily  for  electronics  systems. 

In  this  report  we  examine  the  relevance  of  this  and  other  approaches.  Reliability 
concepts  are  reviewed  without  restricting  their  scope  to  the  description  of  a 
single  class  of  operating  behavior.  The  discussion  begins  with  the  introduction 
of  the  mathematical  functions  most  commonly  used  in  reliability  descriptions. 
Modeling  of  the  constant,  increasing,  and  decreasing  hazard  rate  situations 
is  discussed. 

Reliability problems tend to have strongly statistical aspects. This leads
us to deal with probability ideas and the properties of distributions. Reliability
and service life concepts are compared and contrasted from the prediction
viewpoint. The task of demonstrating reliability in as-built equipment is also
dealt with. The report calls attention to some of the special problems, such as
limited production and long intended life, associated with evaluating sonar
equipment reliability. It concludes with several recommendations directed toward
the systematic improvement of sonar hardware.

Most  of  the  material  presented  here  was  developed  in  the  open  periodical 
literature  and  now  has  been  refined  and  cataloged  in  standard  reliability  texts. 
However, as a distinct autonomous discipline, reliability studies are only about
35 years old. There seems to be an important fragmentation between advocates of
handbook-style prediction and practitioners of probabilistic design. This author
views the two approaches as complementary, each with advantages and limitations.

An effort is made in this report to provide the background needed to make
progress on sonar problems from both points of view. Of necessity the scope of
this effort must be limited. It is hoped, however, that a basis for more specific
and detailed studies is established.

An attempt has been made to present the material in sufficient detail and
with enough rigor to serve the technical needs of managers of sonar upgrading and
procurement programs and of other workers in the field. Given the economic
benefits that usually accompany well-structured reliability efforts, the
importance of this kind of pursuit cannot be overemphasized.
2.0 STANDARD MODELING CONCEPTS


Discussions of reliability topics commonly begin with a definition of
reliability. There is some variety among reliability definitions, but a
representative example might be: Reliability is the probability that an equipment
will satisfactorily perform its intended function for a specified time when
operated in the manner and for the purpose intended. This statement conveys a
general impression of the reliability concept but is incomplete. It needs to be
supported by specifications of the nature of the probability statement, what
constitutes satisfactory equipment performance, mission duration, environmental
exposures, proper use, etc. It is possible to extend the reliability definition
to include these important qualifying statements, but clarity often suffers
when this is done. Alternatively, the statement can be streamlined to simply:
reliability is the probability of success. Again, communication on the subject
then requires clarifying a number of related circumstances.

Measuring  reliability  involves  quantifying  a  probability  statement.  Thus 
single  unit  reliability  is  not  directly  observable  but  must  be  inferred  from 
other  information  relating  to  failed  units  within  a  population.  The  important 
relationships are developed and cataloged in Section 2.1. Sections 2.2, 2.3,
and 2.4 deal with specific examples of situations exhibiting constant, increasing,
and decreasing hazard rates.


2.1 Some Reliability Functions and Interrelationships

Several important reliability functions and relationships are displayed in
Table I. The first 6 line entries are functions commonly encountered in reliability
theory. Strictly, the MTBF is not a function but rather a statistic (measuring
the central tendency of f(t), the time-to-failure probability density function).
The MTBF probably acquires its status because its specification in a one-parameter
model (such as the exponential case) completely characterizes the description.
The remaining reliability functions all depend on time t, here taken to represent
operating time or the age of the component/equipment/system since manufacture or
installation.

The second portion of Table I gives a number of relationships connecting
the various reliability functions. Unreliability is defined via Eq. (2) as
the cumulative of the time-to-failure distribution function. Unreliability
and reliability are complementary functions via Eq. (6). Equations (2) and (6)
imply Eq. (1a). Equation (1b) is derived in Appendix A. Differentiation of
Eq. (2) under the integral sign leads to Eq. (3a), while use of Eq. (6) further
implies Eq. (3c). Differentiation of Eq. (1b) implies Eq. (4a) and its equivalent
Eq. (4b). Use of Eq. (3c) in Eq. (4a) leads to Eq. (3b). Equation (5a) defines
MTBF, and integration by parts implies Eq. (5b) as an equivalent statement.
Equation (7) defines conditional reliability as a function of age t at the start
of a mission and mission duration T.

The reliability functions of Table I are so richly interconnected that
specification of any one of the functions R(t), U(t), f(t), or λ(t) implies all
the other quantities of interest including MTBF and R(t,T). In contrast,
specifying MTBF alone places a single constraint on reliability modeling parameters
and implies a complete description only in the case of a single-parameter model.
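
The interconnection described above can be illustrated numerically. The following
is a minimal sketch, not the report's own method. Table I is not reproduced in
this text, so the sketch assumes the standard forms R(t) = exp(-integral of λ),
U(t) = 1 - R(t), f(t) = λ(t) R(t), MTBF = integral of R(t), and
R(t,T) = R(t+T)/R(t). All function names and the example hazard λ(t) = 2t are
hypothetical.

```python
import math

def cumulative_hazard(hazard, t, steps=2000):
    """Trapezoidal estimate of the integral of hazard(u) for u in [0, t]."""
    if t <= 0.0:
        return 0.0
    h = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for k in range(1, steps):
        total += hazard(k * h)
    return total * h

def reliability(hazard, t):
    """R(t) = exp(-integral of the hazard), the form assumed for Eq. (1b)."""
    return math.exp(-cumulative_hazard(hazard, t))

def unreliability(hazard, t):
    """U(t) = 1 - R(t), the complement relation."""
    return 1.0 - reliability(hazard, t)

def density(hazard, t):
    """f(t) = lambda(t) * R(t)."""
    return hazard(t) * reliability(hazard, t)

def conditional_reliability(hazard, age, mission):
    """R(t, T) = R(t + T) / R(t) for a unit that has survived to age t."""
    return reliability(hazard, age + mission) / reliability(hazard, age)

def mtbf(hazard, t_max, steps=20000):
    """MTBF as the integral of R(t) over [0, t_max]; t_max must cover the tail."""
    h = t_max / steps
    cum = 0.0       # running integral of the hazard
    prev_r = 1.0    # R(0) = 1
    total = 0.0
    for k in range(1, steps + 1):
        cum += 0.5 * (hazard((k - 1) * h) + hazard(k * h)) * h
        r = math.exp(-cum)
        total += 0.5 * (prev_r + r) * h
        prev_r = r
    return total

def linear_hazard(t):
    """Hypothetical increasing hazard, lambda(t) = 2t, giving R(t) = exp(-t**2)."""
    return 2.0 * t

if __name__ == "__main__":
    print("R(1)      =", reliability(linear_hazard, 1.0))
    print("U(1)      =", unreliability(linear_hazard, 1.0))
    print("f(1)      =", density(linear_hazard, 1.0))
    print("R(0.5, 1) =", conditional_reliability(linear_hazard, 0.5, 1.0))
    print("MTBF      =", mtbf(linear_hazard, 10.0))
```

Supplying only the hazard function is enough to recover every other quantity,
which is the point made in the text. Supplying only an MTBF would not determine
the hazard unless the model had a single parameter.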
2.2 Random Hazard Case -- Exponential Reliability


In order to better visualize the common forms of reliability modeling, let
us graphically relate reliability and expected failure times to the underlying
hazard function λ(t). Many systems experience a fairly stable minimum hazard
rate only after a period of operation that separates congenitally inferior units
from the rest of the population of similar items. The weak units are referred to
as early failures and the time domain of their occurrence is termed the infant
mortality region. A period of stable hazard rate is referred to as the random
hazard region. Typically, as damage to the system accumulates, the hazard rate
increases rapidly, signaling entry into the wearout phase. Hazard functions for
electronic and mechanical components are sketched in Fig. 1. Electronic
components tend to exhibit a pronounced region where the hazard rate is constant,
as shown in Fig. 1a (the celebrated "bathtub" curve of reliability studies).
Wearout, as suggested in Fig. 1b, tends to be more prominent in mechanical systems.
In this section of the report, we consider the reliability implications of a
constant hazard rate. Early and wearout reliability are dealt with in the sections
to follow.

The region of stable hazard rate is referred to as the random hazard region
because a failure is equally as likely to occur in any one time interval as in
any other such interval of equal duration. Applying Eq. (1b) to this situation
(λ = const.) yields for the random hazard reliability

    R(t) = e^{-\lambda t} .  (8)

Thus constant hazard rate implies exponential reliability. The time-to-failure
probability density function is obtained by using Eq. (8) in Eqs. (3b) or (3c).
Thus,

    f(t) = \lambda e^{-\lambda t} .  (9)

The function f(t) is itself an exponential function (scaled by λ). Equations
(8) and (9) follow from a constant hazard rate λ. For completeness we include the
latter statement explicitly as

    \lambda(t) = \lambda .  (10)

Equations  (8),  (9),  and  (10)  characterize  what  is  usually  referred  to  as  the 
random  hazard  or  exponential  reliability  situation.  Representative  random  hazard 
reliability  functions  are  plotted  in  Fig.  2.  In  practice  care  must  be  exercised 
to  make  sure  the  random  hazard  description  is  used  appropriately.  This  may  mean 
eliminating  early  failures  via  burn-in  techniques  or  avoiding  the  wearout  region 
by  limiting  the  time  domain  wherein  Eqs.  (8)  through  (10)  are  used. 
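
One consequence of Eqs. (8) through (10) is that the reliability for a mission
of fixed duration does not depend on equipment age. The sketch below illustrates
this, assuming the standard conditional reliability form R(t,T) = R(t+T)/R(t) for
Eq. (7). The function names and the hazard rate of 0.002 per hour are hypothetical
values chosen only for illustration.

```python
import math

def exp_reliability(lam, t):
    """Eq. (8): R(t) = exp(-lam * t)."""
    return math.exp(-lam * t)

def exp_density(lam, t):
    """Eq. (9): f(t) = lam * exp(-lam * t)."""
    return lam * math.exp(-lam * t)

def exp_hazard(lam, t):
    """Hazard recovered as f(t) / R(t); Eq. (10) says this equals lam."""
    return exp_density(lam, t) / exp_reliability(lam, t)

def mission_reliability(lam, age, mission):
    """Assumed form of Eq. (7): R(age + mission) / R(age)."""
    return exp_reliability(lam, age + mission) / exp_reliability(lam, age)

if __name__ == "__main__":
    lam = 0.002  # hypothetical hazard rate per hour, so MTBF = 1/lam = 500 hours
    for age in (0.0, 500.0, 1000.0):
        # The same 100-hour mission reliability is obtained at every age.
        print(age, mission_reliability(lam, age, 100.0))
```

The age independence holds only inside the random hazard region. It fails as soon
as early failures or wearout contribute to the hazard.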


2.3  Normally  Distributed  Times  to  Failure — Wearout 

In the previous section it was convenient to use constancy of the hazard
function as a point of departure. This corresponded to a static reliability
situation in which the vulnerability of the system or component of interest under
applied stresses did not change with time. This is a proper description of many
real-life reliability problems. There are also numerous situations wherein the
performance attributes of the item of interest degrade with time. A wide variety of
wearout phenomena such as fatigue, corrosion, sputtering, abrasion, diffusion,
etc., operate to populate this category. In general, wearout is characterized by
the systematic loss of system or component performance due to material property
changes or outright loss of working substance. The changes may be induced by
applied thermal or mechanical loads; sputtering, diffusion, and fatigue crack
growth are examples. Degradation processes such as corrosion and diffusion may
also proceed independently of applied load. Synergistic effects such as stress
corrosion or corrosion fatigue also occur.

Loss of function to wearout translates into an increasing vulnerability to
catastrophic failure in service under normal application of stress. Thus, the
hazard function (probability per unit time of experiencing a failure) is an
increasing function as damage to the system or component of interest accumulates.
The variety of wearout processes and the variability of loading situations often
make it inconvenient to characterize the shape of the increasing hazard function
directly. Instead, one commonly acquires time-to-failure information and proceeds
to an initial specification of the time-to-failure probability density function
f(t). This is typically a peaked function that increases as hardware vulnerability
increases and decreases as significant numbers of the test population are lost to
failure. When f(t) has been characterized, the corresponding reliability and
hazard functions can be obtained analytically or numerically from Eqs. (1a) and
(3b), respectively. A commonly encountered wearout time-to-failure distribution
is the Gaussian or normal distribution


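    f_N(t) = \frac{1}{\sigma_t \sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{t - \mu_t}{\sigma_t} \right)^2 \right]  (11)

having mean value or position μ_t and spread or dispersion σ_t. In corrosion
problems the times-to-failure of similar units are often log normally distributed
(i.e., the logarithms of the times-to-failure are distributed normally). The log
normal distribution is

    f_{LN}(t) = \frac{1}{\sigma_{\ln t} \, t \sqrt{2\pi}} \exp\left[ -\frac{1}{2} \left( \frac{\ln t - \mu_{\ln t}}{\sigma_{\ln t}} \right)^2 \right]  (12)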


Examples  of  Eqs.  (11)  and  (12)  and  their  corresponding  reliability  and  hazard 
functions  are  plotted  in  Figures  3  and  4. 
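
The reliability and hazard functions that accompany Eqs. (11) and (12) have no
elementary closed form, but they are easily evaluated with the complementary
error function. The sketch below is one way to do so and is not taken from the
report. It assumes the hazard is obtained as f(t)/R(t) and, for the normal case,
that the mean lies many standard deviations above zero so that the small
probability of negative failure times can be ignored. The parameter values are
hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT2PI = math.sqrt(2.0 * math.pi)

def normal_pdf(t, mu, sigma):
    """Eq. (11): normal time-to-failure density."""
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * SQRT2PI)

def normal_reliability(t, mu, sigma):
    """R(t) = 1 - Phi((t - mu) / sigma), written with erfc."""
    return 0.5 * math.erfc((t - mu) / (sigma * SQRT2))

def lognormal_pdf(t, mu_ln, sigma_ln):
    """Eq. (12): log normal density; requires t > 0."""
    z = (math.log(t) - mu_ln) / sigma_ln
    return math.exp(-0.5 * z * z) / (sigma_ln * t * SQRT2PI)

def lognormal_reliability(t, mu_ln, sigma_ln):
    """R(t) = 1 - Phi((ln t - mu_ln) / sigma_ln); requires t > 0."""
    return 0.5 * math.erfc((math.log(t) - mu_ln) / (sigma_ln * SQRT2))

def normal_hazard(t, mu, sigma):
    """Hazard as f(t) / R(t)."""
    return normal_pdf(t, mu, sigma) / normal_reliability(t, mu, sigma)

def lognormal_hazard(t, mu_ln, sigma_ln):
    """Hazard as f(t) / R(t)."""
    return lognormal_pdf(t, mu_ln, sigma_ln) / lognormal_reliability(t, mu_ln, sigma_ln)

if __name__ == "__main__":
    mu, sigma = 1000.0, 100.0               # hypothetical wearout life in hours
    mu_ln, sigma_ln = math.log(1000.0), 0.5  # hypothetical log normal parameters
    for t in (800.0, 1000.0, 1200.0):
        print(t,
              normal_reliability(t, mu, sigma), normal_hazard(t, mu, sigma),
              lognormal_reliability(t, mu_ln, sigma_ln),
              lognormal_hazard(t, mu_ln, sigma_ln))
```

For the normal case the computed hazard rises steadily with age, which is the
wearout signature described in this section.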


2.4 Infant Mortality

Infant  mortality  refers  to  the  early  failure  of  substandard  hardware  items. 
These  units  contain  flaws  or  defects  not  properly  representative  of  the  entire 
population  of  nominally  similar  devices.  As  early  failures  occur  weaker  units 
are  removed  from  service  while  more  rugged  ones  continue  to  function.  Thus,  as 
the  early  phase  progresses  the  probability  per  unit  time  of  experiencing  additional 
failures  decreases.  Early  life  is  characterized  by  a  decreasing  hazard  rate,  a 
decreasing  time-to-failure  density  function,  and  a  reliability  function  that 
decreases  more  rapidly  than  an  exponential  function  at  first  and  then  more  slowly. 
Examples of the infant mortality hazard, reliability, and time-to-failure
probability density functions are presented in Figure 5. Quite often early
failure studies will be best represented by time axis displacements corresponding
to failures having occurred during manufacture or transit or otherwise prior
to the initiation of actual reliability testing.

Figure 5 refers to a purely early failure situation. That is, all units
are taken to be substandard for illustrative purposes. Normally a test
population will contain both normal devices and congenitally weak units. The latter
are potential early failure candidates and may be largely separated and prevented
from causing subsequent service problems by appropriate preliminary exercising or
burn-in procedures. After burn-in the remaining population of hardware items can
be characterized as exhibiting purely random hazard, purely wearout, or perhaps
combined random and wearout behavior. The minimum vulnerability to normal load
stresses (force, pressure, voltage, current, temperature, etc.) occurs during the
random hazard or exponential reliability phase. Thus, for critical applications
early failures must be systematically eliminated (via burn-in or proof testing,
for example). Similarly, the impact of wearout must be ameliorated by proper
parts selection, adequate design measures, and preventive maintenance.

Early failures are most conveniently represented using the Weibull
distribution, a subject that is deferred to Section 3.1.


3.0  FURTHER  DISTRIBUTIONAL  ASPECTS 

In  the  previous  sections  of  this  report  we  began  to  touch  upon  the  statistical 
aspects  of  reliability  and  service  life  of  hardware.  Times-to-failure  were  found 
to  be  distributed.  A  few  important  distributions  are  in  common  use  to  describe 
several  important  behavior  classes  (random,  early,  and  wearout  failures).  There 
are  many  other  well  established  distributions  that  are  useful  from  time  to  time  in 
reliability  work.  However,  for  the  present  purpose  it  is  necessary  to  limit  the 
scope  of  discussion  here. 

Section 3.1 deals with Weibull statistics, a generalization capable of
unifying the descriptions of the random hazard, wearout, and infant mortality
situations. In Section 3.2 we introduce the idea that reliability itself must be
distributed. The connection of distributed time-to-failure properties with
underlying system complexity and functional redundancy is touched on in Section 3.3
and its subsections.


3.1 A Generalized Description -- Weibull Statistics

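In 1951 Weibull (Ref. 1) introduced a probability density function which has
proved to be very comprehensive and well suited to reliability and life studies.
There are both two- and three-parameter versions of the Weibull distribution. The
three-parameter probability density function is

    f(t) = \frac{\beta}{\eta} \left( \frac{t - \gamma}{\eta} \right)^{\beta - 1} \exp\left[ -\left( \frac{t - \gamma}{\eta} \right)^{\beta} \right] .  (13)

The corresponding Weibull reliability and hazard functions are

    R(t) = \exp\left[ -\left( \frac{t - \gamma}{\eta} \right)^{\beta} \right]  (14)

and

    \lambda(t) = \frac{\beta}{\eta} \left( \frac{t - \gamma}{\eta} \right)^{\beta - 1} .  (15)

The allowed ranges of the parameters are:

    \gamma < t < \infty
    -\infty < \gamma < \infty
    \eta > 0
    \beta > 0 .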
The two-parameter Weibull reliability model is obtained from Eqs. (13)
through (15) by setting the location parameter γ equal to zero. Equation (15)
represents equally well decreasing, constant, or increasing hazard rate situations
as the parameter β takes values less than, equal to, or greater than one,
respectively. Since β has such a dramatic impact on the character or functional
shape of the Weibull distribution, it is referred to as the shape parameter.
Weibull shape effects are illustrated in Figure 6. Changing only η has the same
visual effect on a plot of the distribution as stretching or compressing the
abscissa coordinate scale and symmetrically compressing or stretching the ordinate
scale. This leaves the normalization of Eq. (13) unaffected. Thus, η is referred
to as the scale parameter of the Weibull distribution.

We have seen that the Weibull distribution has a decreasing, constant, or
increasing associated hazard function depending on parameter choices. Another
way to verify the versatility of this model is to look at limiting forms of the
Weibull distribution function itself. Reference 2 and sources referred to therein
point out that for β = 1, 2, and 3.313, Eq. (13) reduces respectively to the
two-parameter exponential distribution, the Rayleigh distribution, and approximately
to the normal distribution. The common one-parameter exponential distribution
obtains when β = 1 and γ = 0. Equation (13) is skewed to the right for values of
β up to about 3.313 and skewed to the left for β greater than 3.313. In the
former case (β < 3.313) it is likely that a choice of parameters can be made such
that Eq. (13) is also a satisfactory representation of the log normal distribution.

The  Weibull  distribution  is  very  convenient  in  that  it  allows  the  same  formal 
reliability  description  to  embrace  all  three  important  practical  situations. 
However,  infant  mortality,  random  failures,  and  wearout  are  not  represented 
simultaneously  by  the  same  Weibull  distribution.  When  more  than  one  type  of 
failure  mode  operates  in  a  group  of  items  of  interest,  the  group  is  referred  to 
as  a  mixed  population.  The  reliability  description  of  mixed  populations  is 
discussed  further  in  Section  5.2.3.  Estimating  the  parameters  of  the  Weibull  or 
other  distributions  is  dealt  with  specifically  in  Section  6.2. 
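
The effect of the shape parameter on the hazard rate can be checked with a few
lines of code. The sketch below assumes the standard three-parameter Weibull
forms for the density, reliability, and hazard functions, with shape `beta`,
scale `eta`, and location `gamma`. The function names and numerical values are
hypothetical.

```python
import math

def weibull_reliability(t, beta, eta, gamma=0.0):
    """R(t) = exp(-((t - gamma) / eta) ** beta), valid for t > gamma."""
    return math.exp(-(((t - gamma) / eta) ** beta))

def weibull_hazard(t, beta, eta, gamma=0.0):
    """lambda(t) = (beta / eta) * ((t - gamma) / eta) ** (beta - 1), t > gamma."""
    return (beta / eta) * ((t - gamma) / eta) ** (beta - 1.0)

def weibull_pdf(t, beta, eta, gamma=0.0):
    """f(t) = lambda(t) * R(t)."""
    return weibull_hazard(t, beta, eta, gamma) * weibull_reliability(t, beta, eta, gamma)

if __name__ == "__main__":
    eta = 10.0  # hypothetical scale parameter
    for beta, label in ((0.5, "decreasing"), (1.0, "constant"), (2.0, "increasing")):
        early = weibull_hazard(1.0, beta, eta)
        late = weibull_hazard(5.0, beta, eta)
        print(f"beta={beta}: hazard {early:.4f} -> {late:.4f} ({label})")
```

With the location parameter at zero, the reliability at t = η equals exp(-1)
for every shape parameter, which gives a quick check on a fitted scale value.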


3.2 Distributed Reliability -- Confidence Limits

Reliability  has  been  introduced  as  a  quantity  measuring  the  probability  of 
successful  operation  of  a  given  component  or  system  under  specified  conditions. 
Reliability is completely specified as a deterministic function of time (and
loading and strength parameters) via Eqs. (1a) and (1b) if an appropriate
time-to-failure probability density function f(t) or hazard function λ(t) is
supplied. We have no trouble reconciling the concept of deterministic reliability
with the variability of success/failure outcomes when we actually operate
equipment. Reliability is only the probability of success and not a guarantee of
successful operation in some fraction of attempts made.

The situation can be likened to the casting of dice. If a die is formed
symmetrically, we assign it an a priori probability of 1/6 of showing any one of
the six faces when cast. Even if the die is fair (unloaded), however, this does
not assure that in six throws each face will show a single time only. But in a
large number of throws the fractional exposure of each face of the die will
approach 1/6 for a fair die. If the die is loaded, different occurrence fractions
for the six faces will be obtained in this way. We will have measured the loading
in terms of the unsymmetrical a posteriori probabilities of showing the six die
faces. Whether the die is fair or not, the number of times a given face shows up
in a certain number of throws is a random variable subject to fluctuations under
a repetition of the test. Thus, the best one can hope to do is to characterize
the situation in terms of observed averages and some measure of the scatter of
the data used to form them. This is a distributional description, and the
probability of showing any particular die face is distributed. In the case of a
fair die the distributed a posteriori probability will have a high probability of
including the a priori value. If the die is loaded, the former will likely exclude
the latter.

How is reliability to be compared and contrasted with the casting of dice?
First of all, a posteriori probabilities are measured in the same way: by operating
the equipment and counting successes and failures, or by casting dice and similarly
noting the outcome. Many dice can be used or a single die can be thrown
repetitively. Similarly, many (identical) equipments can be exercised or a single
one subjected to appropriate repair between uses. An important contrast is that in
general there is no suitable way to assign an a priori reliability. Previous
experience with similar equipment constitutes a related measurement rather than
an evaluation based on structural arguments and advanced independently of
operational experience.

At the beginning of this section we noted that reliability is fully determined
if the related time-to-failure or hazard functions are completely specified. One
can invent reliability models where this is imagined to be the case, but as a
practical matter this situation does not occur. In practice the properties of
the continuous functions f(t) or λ(t) are inferred from a limited number of
discrete observations. The result is that only a statistically distributed
description of the parameters of these functions can be specified. The derived
reliability function is also distributed. Similar reliability conclusions are
drawn directly if one focuses attention on the unfailed fraction of an equipment
population as a function of time rather than the equivalent indicators (observed
failure times, number of failures in intervals of equal duration).

In order to further clarify the concept of distributed reliability, let us
pursue a more formal line of reasoning. To be specific and restrict the scope of
the discussion somewhat, consider a nonreplacement test of N equipments which is
terminated at the occurrence of the r-th observed failure. Each of the r failure
times t_1, ..., t_r is recorded. Let us consider further for the moment that we
have some independent assurance that the equipments under test are identical and
exhibit exponential reliability. (In nature radioactive or fluorescent atoms of
the same kind meet these last two requirements; for manufactured hardware, however,
this must be recognized as an idealization.) We take our problem to be specifying
the parameter θ of the one-parameter exponential time-to-failure distribution

    f(t) = \frac{1}{\theta} e^{-t/\theta}  (16)

from the available set of observed t_i (i = 1, ..., r). Notice Eq. (16) is simply
Eq. (9) written in terms of θ = 1/λ = MTBF.

On the basis of maximum likelihood theory the best estimate θ̂ of the true
MTBF θ is (see Appendix 10.C.1 of Ref. 3, for example)

    \hat{\theta} = \frac{1}{r} \left[ \sum_{i=1}^{r} t_i + (N - r) t_r \right] .  (17)
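
Equation (17) is straightforward to apply to a test log. The sketch below assumes
the standard maximum likelihood form for a nonreplacement test stopped at the
r-th failure: the total accumulated unit operating time divided by the number
of failures. The function name and the failure times are hypothetical.

```python
def mtbf_estimate(failure_times, n_units):
    """Estimate MTBF from a nonreplacement test stopped at the last listed failure.

    failure_times: the r observed failure times (any order).
    n_units: number of units N placed on test.
    Returns [sum(t_i) + (N - r) * t_r] / r, where t_r is the largest failure time.
    """
    times = sorted(failure_times)
    r = len(times)
    if r == 0:
        raise ValueError("at least one observed failure is required")
    if n_units < r:
        raise ValueError("n_units cannot be smaller than the number of failures")
    t_r = times[-1]
    # Failed units contribute their own lives; survivors each contribute t_r.
    total_time = sum(times) + (n_units - r) * t_r
    return total_time / r

if __name__ == "__main__":
    # Hypothetical test: 20 units, stopped at the 5th failure (times in hours).
    print(mtbf_estimate([55, 120, 190, 260, 340], n_units=20))
```

In this hypothetical case the fifteen surviving units supply most of the
accumulated operating time, so the estimate is far larger than any single
observed failure time.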


The estimator θ̂ is in fact a distributed random variable, since application of
Eq. (17) to more than one nominally identical experiment will yield different
results. Many such repetitions would produce an experimental determination of
the distribution of estimators f(θ̂). Since we are dealing with a known
time-to-failure distribution [Eq. (16)], this information can also be developed
analytically. In the pioneering study in this area Epstein and Sobel (Ref. 4) have
shown that the distribution of estimators based on observing r failures among N
units drawn from an exponential population is

    f(\hat{\theta}) = \frac{1}{(r-1)!} \left( \frac{r}{\theta} \right)^{r} \hat{\theta}^{\,r-1} \exp\left( -\frac{r \hat{\theta}}{\theta} \right) .  (18)


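The average or expected value of θ̂ is

    E(\hat{\theta}) = \int_0^{\infty} \hat{\theta} f(\hat{\theta}) \, d\hat{\theta} = \theta .  (19)

Similarly

    E(\hat{\theta}^2) = \int_0^{\infty} \hat{\theta}^2 f(\hat{\theta}) \, d\hat{\theta} = \frac{r+1}{r} \theta^2 .  (20)

The variance is

    \sigma_{\hat{\theta}}^2 = E(\hat{\theta}^2) - (E(\hat{\theta}))^2 = \theta^2 / r .  (21)

And the coefficient of variation is

    COV = \sigma_{\hat{\theta}} / \mu_{\hat{\theta}} = 1/\sqrt{r} .  (22)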
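
The statement that the estimator is itself distributed, with mean θ and
coefficient of variation 1/√r, can be checked by repeating a simulated test many
times. The sketch below is a Monte Carlo illustration only. The function names,
the number of trials, and the random seed are hypothetical choices, and the
estimator is the standard form assumed for Eq. (17).

```python
import math
import random

def simulate_estimates(theta, n_units, r, trials, seed=0):
    """Repeat a nonreplacement test stopped at the r-th failure; return the estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        # Draw N exponential lives with true MTBF theta and order them.
        lives = sorted(rng.expovariate(1.0 / theta) for _ in range(n_units))
        t_r = lives[r - 1]
        total_time = sum(lives[:r]) + (n_units - r) * t_r
        estimates.append(total_time / r)
    return estimates

def mean_and_cov(values):
    """Sample mean and coefficient of variation (standard deviation / mean)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance) / mean

if __name__ == "__main__":
    estimates = simulate_estimates(theta=1.0, n_units=20, r=10, trials=20000)
    mean, cov = mean_and_cov(estimates)
    print("sample mean of estimates:", mean)            # expected near theta = 1
    print("sample COV of estimates: ", cov)             # expected near 1/sqrt(10)
    print("1/sqrt(r):               ", 1.0 / math.sqrt(10))
```

The scatter depends on the number of failures r and not on the number of units
N placed on test. Adding units shortens the calendar time needed to reach r
failures but does not tighten the estimate.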

As an example Eq. (18) is plotted in Fig. 7 for the case r = 10, θ = 1. The
cumulative of Eq. (18) is also shown in Fig. 7, from which we see for example that
the 90% two-sided confidence statement that can be made with respect to the
expected range of θ̂ is

    0.520 < \hat{\theta} < 1.569 .  (23)
Equation (23) states that if exponential units having true MTBF θ are tested
until 10 failures are observed, the estimator θ̂ will have a 90% probability of
being in the indicated range. Normally, however, the true θ is unknown and we
would like to reverse the philosophy of Eq. (23) to sharpen the description of
θ̂ as an estimator of θ.


We turn again to the work of Epstein and Sobel (Ref. 4), who showed that the
random variable

    z = \frac{2 r \hat{\theta}}{\theta}  (24)

is χ² distributed with 2r degrees of freedom. That is,

    f(z) = \frac{1}{2^r (r-1)!} z^{r-1} e^{-z/2} .  (25)


Using the variable transformation methods described in Chapter 5 of Ref. 5,
Eqs. (18) and (25) are seen to be trivially related. Via reasoning along the
same line (the process is amplified in Appendix B), one obtains the distribution
of true MTBF values θ compatible with a single observed estimate θ̂. Thus

    f(\theta) = \frac{1}{r! \, \hat{\theta}} \left( \frac{r \hat{\theta}}{\theta} \right)^{r+1} \exp\left( -\frac{r \hat{\theta}}{\theta} \right) .  (26)


The origin moments of Eq. (26) of interest are

    E(\theta) = \left( \frac{r}{r-1} \right) \hat{\theta}  (27)

and

    E(\theta^2) = \frac{r^2 \hat{\theta}^2}{(r-1)(r-2)} .  (28)

The variance and coefficient of variation are

    \sigma_{\theta}^2 = \frac{r^2 \hat{\theta}^2}{(r-1)^2 (r-2)}  (29)

and

    COV = \sigma_{\theta} / \mu_{\theta} = 1/\sqrt{r-2} .  (30)


Equation (26) together with its cumulative is plotted for the particular case
r = 10, θ̂ = 0.9 in Fig. 8. The cumulative of Eq. (26) can be used directly to
make any desired confidence statement with respect to the ranging of θ about θ̂
for a given r. For example, from Fig. 8 the 90% two-sided confidence statement
is

    0.56 < \theta < 1.65 .  (31)

Or since θ̂ = 0.9 an equivalent representation is

    0.62 \hat{\theta} < \theta < 1.83 \hat{\theta} .  (32)

The approach described in the preceding paragraph is awkward since Eq. (26)
must be integrated for each pair of experimental parameters θ̂ and r. Thus,
standard practice involves instead the combined use of Eqs. (24) and (25) and
available tables based on the χ² distribution. The probability statement on z
at a confidence level of 1-α is

    P\left[ \chi^2_{(1-\alpha/2),2r} < \frac{2 r \hat{\theta}}{\theta} < \chi^2_{\alpha/2,2r} \right] = 1 - \alpha .  (33)

Here \chi^2_{p,2r} denotes the value that a χ² variable with 2r degrees of
freedom exceeds with probability p. An equivalent statement providing the
two-sided confidence limits on θ at the 1-α confidence level is

    L_2 = \frac{2 r \hat{\theta}}{\chi^2_{\alpha/2,2r}} < \theta < \frac{2 r \hat{\theta}}{\chi^2_{(1-\alpha/2),2r}} = U_2 .  (34)

One-sided confidence limits are implied by Eq. (34) under the replacement α/2 → α
(for L or U but not both since the conjugate limit moves off to ±∞).

Since there is a one-to-one correspondence between MTBF and exponential
reliability via Eq. (8), Eq. (34) implies that the confidence interval for the
reliability function may be specified as

    e^{-t/L_2} < R(t) < e^{-t/U_2} .  (35)
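
The table lookup behind Eqs. (34) and (35) can also be carried out numerically.
The sketch below avoids external libraries by using the closed-form upper tail
probability of the χ² distribution for an even number of degrees of freedom,
together with a bisection search. It assumes that the χ² subscript denotes the
probability of exceeding the tabulated value. All function names are hypothetical.
Limits quoted in the text were read from plotted curves, so computed limits may
differ from them slightly.

```python
import math

def chi2_sf_even(x, r):
    """P(chi-square with 2r degrees of freedom > x), closed form for even dof."""
    half = 0.5 * x
    term = 1.0
    total = 1.0
    for k in range(1, r):
        term *= half / k
        total += term
    return math.exp(-half) * total

def chi2_upper_point(p, r):
    """Value exceeded with probability p by a chi-square variable with 2r dof."""
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, r) > p:
        hi *= 2.0
    for _ in range(200):  # bisection on the monotonically decreasing tail
        mid = 0.5 * (lo + hi)
        if chi2_sf_even(mid, r) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mtbf_limits(theta_hat, r, alpha):
    """Two-sided limits (L2, U2) on the true MTBF at confidence 1 - alpha."""
    lower = 2.0 * r * theta_hat / chi2_upper_point(alpha / 2.0, r)
    upper = 2.0 * r * theta_hat / chi2_upper_point(1.0 - alpha / 2.0, r)
    return lower, upper

def reliability_limits(t, theta_hat, r, alpha):
    """Two-sided limits on R(t) = exp(-t / theta) implied by the MTBF limits."""
    lower, upper = mtbf_limits(theta_hat, r, alpha)
    return math.exp(-t / lower), math.exp(-t / upper)

if __name__ == "__main__":
    # Example case from the text: r = 10 failures, estimate 0.9, 90% two-sided.
    print(mtbf_limits(0.9, 10, 0.10))
    print(reliability_limits(0.1, 0.9, 10, 0.10))
```

The closed-form tail sum applies only because the degrees of freedom, 2r, are
always even in this problem. A general χ² routine would be needed otherwise.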

Of course the full distributional character of the reliability function can be
displayed by applying the methods of Appendix B to Eq. (26) with the proviso

    R(t) = e^{-t/\theta} .  (36)

This yields

    f(R) = \frac{1}{(r-1)!} \left( \frac{r \hat{\theta}}{t} \right)^{r} (-\ln R)^{r-1} R^{(r \hat{\theta}/t) - 1} .  (37)


Equation (37) is plotted in Fig. 9 for the cases r = 10, θ̂ = 0.9 and
t = 0.1, 0.2. One notices that the reliability dispersion as well as its mean
value is time dependent. Another way of visualizing this situation is displayed
in Fig. 10, which shows the time dependence of the 50% confidence boundary and
80% two-sided confidence limits on reliability for the example being considered.

In closing this section let us consider a practical procurement example. The Navy recognizes that perfection of hardware performance and its specification are both unattainable on a finite budget. Thus, compromises in both areas are commonplace in deference to the recognized statistical character of reliability. One might require that the reliability of some equipment for a specified time period be at least 90%. One might further demand sufficient testing under actual service conditions to support this statement at a 90% level of confidence. The problem is a standard one of specifying a one-sided confidence level and limit. The desired lower, one-sided confidence limit on the reliability is 0.9 and the confidence level is 90%. Combining Eqs. (34) and (35) specialized to this case yields

$$R(t) > \exp\!\left(-\frac{t}{2r\hat{\theta}}\,\chi^{2}_{\alpha,2r}\right). \tag{38}$$


Equation (38) can be rearranged to give the minimum MTBF estimator $\hat{\theta}_{\min}$ needed to satisfy the reliability specification at a confidence level of 1-α. Thus

$$\hat{\theta}_{\min} = \frac{t\,\chi^{2}_{\alpha,2r}}{2r\,(-\ln R(t))}. \tag{39}$$


Equation (39) expresses the minimum observed MTBF [via testing per Eq. (17)] needed to assure with 100(1-α)% confidence that reliability of at least R(t) is achieved by an exponentially reliable system for a mission of duration t. This result depends on the number of failures r on which the MTBF estimate is based and hence on the level of testing to which one is willing to commit. Let us return to the specific example (1-α) = 0.9, R(t) = 0.9 and use Eq. (39) to plot $(1/t)\hat{\theta}_{\min}$ as a function of observed failures r. This result is shown in Fig. 11, which also includes similar results for a few other confidence levels. One can see generally that if $\hat{\theta}$ is supported by 5 to 10 observed failures, the minimum observed MTBF must be roughly 15 times the desired mission duration for 90% reliability at a 90 to 95% confidence level.
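The "roughly 15 times" figure can be reproduced from Eq. (39) with a few lines of code. The sketch below is illustrative only. It assumes that $\chi^{2}_{\alpha,2r}$ denotes the value exceeded with probability α by a chi-square variate with 2r degrees of freedom, and it uses the closed-form tail sum available for even degrees of freedom together with a simple bisection; the function names are hypothetical.

```python
import math


def chi2_upper_tail(x, r):
    """P(chi-square with 2r degrees of freedom > x), closed form for even dof."""
    half = x / 2.0
    term = math.exp(-half)
    total = term
    for k in range(1, r):
        term *= half / k
        total += term
    return total


def chi2_upper_point(alpha, r):
    """Value x with P(chi-square_{2r} > x) = alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_upper_tail(hi, r) > alpha:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_upper_tail(mid, r) > alpha:
            lo = mid  # tail still too large, so the point lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def min_mtbf_over_t(reliability, alpha, r):
    """Eq. (39) divided by the mission duration t."""
    return chi2_upper_point(alpha, r) / (2.0 * r * -math.log(reliability))


if __name__ == "__main__":
    for conf in (0.90, 0.95):
        for r in (1, 2, 5, 10, 20):
            ratio = min_mtbf_over_t(0.9, 1.0 - conf, r)
            print(f"confidence {conf:.2f}, r = {r:2d}: theta_min / t = {ratio:.2f}")
```

For r between 5 and 10 the printed ratios fall in the mid-teens, and they decrease slowly as more failures support the estimate.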


3.3 A Components Versus System View

As consumers and users of products we usually take a systems view of reliability. We ask "Is the car running?" rather than inquiring separately about
the  operational  health  of  the  tires,  battery,  fuel  pump,  engine  seals,  hydraulic 
and  electrical  subsystems,  etc.  But  as  engineers,  scientists,  and  managers 
charged  with  improving  sonar  transducers  we  need  to  focus  attention  on  specific 
areas  where  constructive  changes  would  have  a  beneficial  impact.  We  tend  to 
think  of  systems  as  assemblages  of  components.  As  we  shall  see  shortly,  this 
posture  probably  relates  more  to  trends  in  commercial  packaging  than  distinctions 
relating  to  form  and  function.  For  example,  a  simple  square  wave  oscillator  might 
be a system of interest. It is assembled from components such as an integrated circuit, resistors, capacitor, battery, switch, printed circuit board, etc. The capacitor is a component because it is purchased as a separate commodity (no one buys tin foil and paper and rolls his own capacitors anymore). But to the
manufacturer  of  capacitors  this  device  is  itself  a  system  assembled  from  a  variety 
of  more  homogeneous  materials.  Even  in  looking  at  this  indenture  level  we  have 
overlooked  the  processing  steps  required  to  convert  naturally  occurring  raw 
materials  into  the  conductive  foils  and  wires  and  insulating  films  that  go  into 
capacitor construction.

Similarly, the integrated circuit referred to is purchased, installed,
consumed,  and  replaced  as  a  separate  item  (component).  But  with  respect  to 
internal  form  and  function  this  device  is  a  system  of  high  architectural  complexity. 
Many  very  carefully  controlled  masking  and  metallurgical  processing  steps  are 
involved  in  its  manufacture.  The  system  is  complicated  (and  inexpensive)  enough 
to  defy  practical  diagnostics  and  repair.  Systems  which  are  more  expensive  to 
repair  than  replace  tend  to  receive  modular  packaging  and  be  treated  as  throw-away 
items. Curiously, then, it is economics and not structural complexity that most
strongly  influences  the  component  versus  system  distinctions  that  we  normally 
draw.  An  automobile  oil  filter  is  a  throw-away  item  not  because  it  is  structurally 
simple  or  complex  but  because  it  is  easier  and  cheaper  to  replace  it  than  to 
clean  and  evaluate  it  for  reuse. 

We  have  just  seen  that  with  respect  to  form  and  function  commercial  components 
may  in  fact  be  exotic  microsystems.  Conversely  many  heroic  structures  have  a 
very  simple  functional  makeup.  Structurally,  a  highway  is  scarcely  more  than  a 
ribbon of aggregate material. The members of a bridge or a barge are more
homogeneous  or  less  functionally  diversified  than  a  simple  capacitor  or  battery. 
Furthermore,  the  former  are  considered  repairable  while  the  latter  are  not. 

Empirical  reliability  studies  tend  to  attempt  to  catalog  components-level 
experience  from  which  systems-level  descriptions  are  built  by  superposition. 

Looking  at  dictionary  definitions,  one  finds  "component"  referred  to  as  a 
constituent  part  while  "system"  means  an  assemblage  of  such  parts.  One  perceives 
that  components  are  to  be  thought  of  as  structurally  simple  while  systems  are 
complex.  Often,  however,  the  reverse  is  true  as  we  have  seen.  We  shall  explore 
further  the  reliability  implications  of  this  structure  dichotomy  in  the  next  two 
sections  of  the  report. 


3.3.1  Complexity  and  Redundancy 

In  the  previous  section  we  have  begun  to  see  that  the  simplicity  or  complexity 
of  a  fabricated  item  is  not  necessarily  related  to  whether  we  regard  the  item  to 
be  a  component,  system,  subsystem,  etc.  Why  do  we  wish  to  make  such  distinctions? 
This  is  because  reliability  is  related  to  features  of  true  form  and  function 
rather  than  to  arbitrary  packaging  and  assembly  constraints.  The  two  important 
reliability  classes  of  interest  here — exponential  and  wearout — are  primarily 
associated  with  functional  nonredundancy  and  redundancy  respectively.  Complex 
systems  usually  (but  not  always)  exhibit  little  redundancy  and  are  exponentially 
reliable.  Simple  structures  tend  to  have  an  excess  of  working  material  and  so 
are  functionally  redundant  in  a  way  that  leads  to  wearout  reliability  or  a  strongly 
peaked time-to-failure distribution.

Consider  an  ordinary  steel  tensile  member.  This  may  be  a  modern  marvel 
metallurgically.  But  from  a  reliability  standpoint  it  is  homogeneous  with  many 
interatomic bonds sharing the applied load. If the tensile member is conservatively designed, many bond failures or considerable loss of material (to abrasion,
oxidation,  corrosion,  etc.)  can  occur  before  catastrophic  failure  results.  This 
situation  is  in  contrast  to  that  exhibited  by  a  complex  system  the  operation  of 
which  depends  on  the  simultaneous  functioning  of  many  subsystem-structures.  In 
the latter case, many failure modes, whether or not they individually exhibit exponential reliability, contribute randomly to system unreliability. As a result
systems  tend  to  be  characterized  by  exponential  or  random  hazard  reliability.  In 
contrast  redundant  structures  tend  to  exhibit  wearout  reliability.  We  can  consider 
then  the  appropriateness  of  exponential  reliability  modeling  for  electronic 
components.  If  such  components  were  simple  redundant  structures,  they  would  be 
expected  to  exhibit  wearout  reliability.  If  they  are  in  fact  complicated 
microsystems,  then  one  would  expect  their  reliability  description  properly  to  be 
exponential.  The  latter  situation  seems  to  be  the  one  observed  and  is  certainly 
the  basis  of  contemporary  handbook  reliability  prediction.  There  is  also  the 
implication  that  components  that  exhibit  true  internal  redundancy  be  separated  by 
class  and  modeled  appropriately  (wearout  reliability). 

It  has  long  been  recognized  that  mechanical  devices  do  not  fit  the  exponential 
modeling  pattern  as  neatly  as  do  electronic  components.  Hopefully,  the  reasoning 
of  the  last  few  paragraphs  provides  some  relevant  insights.  How  then  should 
sonar  transducers  be  properly  treated?  Structurally  they  are  relatively 
uncomplicated.  One  would  look  more  for  wearout  failure  modes  than  random  hazard 
vulnerabilities.  This  approach  is  in  contrast  to  most  transducer  reliability 
modeling  efforts,  which  thus  far  have  attempted  to  apply  the  purely  exponential 
approach  borrowed  from  electronics  reliability. 

The  relationship  of  reliability  to  system  or  component  complexity,  simplicity, 
or  redundancy  features  is  subject  to  various  confounding  influences  in  practical 
situations.  Caution  is  advised  as  suggested  by  the  examples  presented  in  Section 
3.3.2. 

3.3.2  Further  Confusion — Examples 

We  have  already  noted  some  of  the  redundancy  features  of  material  used  in 
bulk.  These  properties  depend  largely  on  configuration,  however.  Consider 
doubling  the  amount  of  material  in  a  tensile  member,  for  example.  If  this  involves 
preserving  the  length  while  doubling  the  section  area,  the  loading  performance 
and  vulnerability  to  damage  are  both  significantly  improved.  If  the  length  is 
doubled  and  the  section  area  preserved,  one  expects  a  slight  worsening  of  tensile 
strength  because  the  probability  of  encountering  a  performance  limiting  flaw  is 
doubled. 

A  capacitor  also  seems  to  be  a  simple  system  involving  a  dielectric  film 
placed  between  conductive  foils.  This  is  not  really  a  bulk  application,  however, 
since  any  single  flaw  in  the  dielectric  can  lead  to  voltage  breakdown  of  the 
device.  Again,  configuration  details  are  important.  If  the  area  of  the  dielectric 
is  doubled  while  its  thickness  is  kept  the  same,  performance  (capacitance)  and 
vulnerability  both  increase.  If  the  dielectric  area  is  fixed  while  the  thickness 
is  doubled,  capacitance  and  susceptibility  to  voltage  breakdown  are  both  reduced 
(assuming  that  the  operating  conditions  are  not  changed).  The  reliability  benefits 
are  due  to  reduced  specific  loading  rather  than  an  ability  to  tolerate  material 
damage. The capacitor remains a nonredundant, exponentially reliable device. In
contrast,  the  tensile  member  can  survive  material  damage,  exhibits  redundancy  and 
wearout  reliability. 

Redundancy  can  be  artificially  introduced  into  a  reliability  problem  by 
providing  backup  systems  in  one  form  or  another.  When  this  is  done,  the  overall 
system will exhibit classical wearout reliability even if all the subsystems involved exhibit purely exponential reliability. Consider an example. Suppose a sonar array consists of 100 transducer elements, each taken to exhibit exponential reliability $R_e$, where

$$R_e = e^{-\lambda t}. \tag{40}$$

Further, imagine that acceptable beam forming and acoustic signal recovery characteristics are achieved if 90 or more of the 100 elements are functional (this is our system success/failure criterion). The probability of finding exactly m failed units among N total identical devices of reliability $R_e$ is binomially distributed via

$$P_m = \frac{N!}{m!\,(N-m)!}\,R_e^{\,(N-m)}\,(1-R_e)^{m}. \tag{41}$$


More explicitly, using Eq. (40),

$$P_{N,m}(\lambda t) = \frac{N!}{m!\,(N-m)!}\,e^{-(N-m)\lambda t}\left(1-e^{-\lambda t}\right)^{m}. \tag{42}$$

Equation (42) is a discrete probability density function which for all t meets the test $\sum_{m=0}^{N} P_{N,m}(\lambda t) = 1$. Acceptable system performance is associated with the occurrence of 10 or fewer element failures. Thus, we can define system reliability as

$$R_s(t) = \sum_{k=0}^{m} P_{N,k}(\lambda t) \equiv \mathrm{PBS}(N,m,t). \tag{43}$$

Equation (43) has the form of a partial binomial sum [hence the notation PBS(N,m,t)]. The system reliability given by Eq. (43) for the case N = 100, m = 10 is plotted in Fig. 12a. The time-to-failure probability density function associated with Eq. (43) is given by $f_s(t) = -dR_s/dt$ [Eq. (3c)]. Again, using Eq. (40) and performing the indicated differentiation we find


$$f_s(t) = (N-m)\,\lambda\,P_{N,m}(\lambda t). \tag{44}$$


This function is a continuous probability density function satisfying

$$\int_0^{\infty} f_s(t)\,dt = 1.$$

Equation (44) is plotted in Fig. 12b for the case of interest (N = 100, m = 10).
Comparing  Figs.  12  and  3  or  4  we  see  that  the  transducer  array  exhibits  reliability 
features  characteristic  of  wearout.  This  is  purely  the  result  of  allowing  component 
redundancy  in  the  system  success  criterion.  The  transducer  elements  themselves 
were  taken  to  have  exponential  reliability. 
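The array example is easy to explore numerically. The sketch below is a hedged illustration rather than a reproduction of Fig. 12: it evaluates the binomial term of Eq. (41) with the exponential element reliability of Eq. (40), sums it over 0 to m failures to obtain the system reliability, and evaluates the density of Eq. (44). The hazard rate (density divided by reliability) is included as an assumed diagnostic; a hazard that rises with time is the signature of wearout. Time is expressed through the dimensionless product λt, and the function names are hypothetical.

```python
import math


def p_failed(N, m, lam_t):
    """Probability that exactly m of N elements have failed (Eqs. 40 and 41)."""
    R_e = math.exp(-lam_t)
    return math.comb(N, m) * R_e ** (N - m) * (1.0 - R_e) ** m


def system_reliability(N, m, lam_t):
    """Partial binomial sum: probability of m or fewer element failures."""
    return sum(p_failed(N, k, lam_t) for k in range(m + 1))


def system_density(N, m, lam_t, lam=1.0):
    """Time-to-failure density of Eq. (44)."""
    return (N - m) * lam * p_failed(N, m, lam_t)


def hazard(N, m, lam_t, lam=1.0):
    """System hazard rate, f_s(t) / R_s(t)."""
    return system_density(N, m, lam_t, lam) / system_reliability(N, m, lam_t)


if __name__ == "__main__":
    N, m = 100, 10
    for lam_t in (0.02, 0.05, 0.10, 0.15, 0.20):
        print(f"lambda*t = {lam_t:.2f}: R_s = {system_reliability(N, m, lam_t):.4f}, "
              f"f_s = {system_density(N, m, lam_t):.3f}, "
              f"hazard = {hazard(N, m, lam_t):.3f}")
```

With λ set to 1, the density can also be compared against a finite-difference derivative of the reliability as a consistency check on Eq. (44).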


Let  us  consider  an  example  of  just  the  opposite  situation — a  case  where 
components  subject  to  wearout  alone  lead  to  exponential  system  reliability. 
Bazovsky has treated just such a problem in examining the impact of making replacements only as failures occur within a population of 10,000 light bulbs. In the interest of overcoming a slight oversimplification in Bazovsky's treatment, let us reanalyze a similar example. Consider a group of N incandescent lamps subject to wearout failures only. Take the time-to-failure distribution to be normal, centered at μ with standard deviation σ, as shown in Fig. 13a. We
imagine  that  as  each  lamp  fails  it  is  replaced  by  a  new  one.  Thus,  when  all  the 
original  population  has  dropped  out  of  service  it  has  been  replaced  by  a  second 
generation  of  lamps.  However,  these  units  have  been  placed  in  service  over  a 
range of times rather than all at once. As a result, the time-to-failure distribution will now show greater dispersion than that due to wearout effects
alone.  This  convolution  problem  is  analyzed  in  Appendix  C.  The  dispersion 
effects continue to grow with each replacement generation. Soon different lamp generations are represented at the same time. This situation is shown in Fig. 13b. The superposition of the individual time-to-failure distributions representing various lamp generations is depicted in Fig. 13c. This function oscillates at first and then gradually settles to a constant value of N/μ. This equilibrium failure rate is usually unacceptably high. Thus, one cannot ordinarily tolerate replacing wearout failures as they occur. Rather, it is much more productive to anticipate wearout (via pilot studies) and engage in preventive maintenance a few time-to-failure standard deviations before the mean wearout life μ.
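The approach to a constant failure rate of N/μ can be illustrated with a small Monte Carlo experiment. This is a sketch under stated assumptions, not the Appendix C analysis: each lamp socket is simulated independently, lifetimes are drawn from a normal distribution (redrawn in the rare case of a non-positive value), and the values of N, μ, and σ are hypothetical. A fairly broad distribution (σ/μ = 0.25) is chosen so that the oscillations die out within a few generations.

```python
import random


def simulate_failure_times(n_sockets, mu, sigma, horizon, seed=0):
    """Failure times up to `horizon` when every failed lamp is replaced at once."""
    rng = random.Random(seed)
    failures = []
    for _ in range(n_sockets):
        t = 0.0
        while True:
            life = rng.gauss(mu, sigma)
            while life <= 0.0:  # a normal draw can be negative; redraw if so
                life = rng.gauss(mu, sigma)
            t += life
            if t > horizon:
                break
            failures.append(t)
    return failures


def failure_rate(failures, t_start, t_end):
    """Average number of failures per unit time in [t_start, t_end)."""
    count = sum(1 for t in failures if t_start <= t < t_end)
    return count / (t_end - t_start)


if __name__ == "__main__":
    N, mu, sigma = 2000, 1000.0, 250.0  # hypothetical population and lamp life
    failures = simulate_failure_times(N, mu, sigma, horizon=20 * mu, seed=1)
    for k in range(0, 12):
        start, end = k * 0.5 * mu, (k + 1) * 0.5 * mu
        rate = failure_rate(failures, start, end)
        print(f"t/mu in [{k * 0.5:.1f}, {(k + 1) * 0.5:.1f}): "
              f"rate / (N/mu) = {rate / (N / mu):.2f}")
```

The early windows show the rate swinging well below and above N/μ as the first generations fail in bunches; the later windows settle near 1 in these units.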

In  the  example  just  discussed  a  constant  failure  rate  develops  because  the 
replacement  strategy  invoked  leads  to  units  being  placed  in  service  at  random 
times. This has the effect of completely masking the intrinsic wearout characteristic assumed to be operative. Another implication of the constant failure rate that develops is that the population will decay exponentially if the replacement program is abandoned. This is an example of exponential reliability associated with purely wearout failures. Contriving to display exponential properties within a pure wearout situation is not merely a pedantic exercise. Multiple
test  stand  replacement  testing  is  often  carried  out  in  evaluating  the  performance 
of  exponentially  reliable  units.  One  is  cautioned  to  observe  the  distribution 
of  individual  times  to  failure  as  well  as  the  total  number  of  failures  and 
total  test  time.  This  permits  the  confirmation  of  a  true  random  hazard  situation 
and  avoids  confusion  with  the  case  where  wearout  units  are  placed  in  service  at 
random times. Similar concerns arise in servicing commercial equipment. Experiencing a constant replacement rate for a particular component does not alone determine whether the failures involved are of the random or wearout type.


4.0 RELIABILITY PREDICTION


Reliability  prediction  is  an  exercise  that  one  engages  in  prior  to  committing 
to the production of new hardware. Its purpose is to estimate the probable
survival  characteristics  of  the  equipment  against  mission  objectives.  A  relevant 
dictionary definition states that to predict is to foretell with precision of calculation, knowledge, or shrewd inference from facts or experience. Thus a solar or
lunar  eclipse  may  be  predicted  by  measuring  the  relative  positions  of  the  sun, 
earth,  and  moon;  discovering  the  laws  that  describe  their  motion;  and  calculating 
trajectories  for  future  time.  Within  the  disciplines  of  physics  and  astronomy  all 
of  this  has  been  elegantly  accomplished.  On  the  celestial  scale  position  and  time 
can be simultaneously measured to high precision. The problem is also well characterized by considering only gravitational forces. Astronomical prediction is
considered  to  be  mature,  satisfying,  and  successful. 

In  a  sense  the  philosophy  of  reliability  prediction  is  the  same  as  that  of 
any  predictive  science — inferring  some  future  behavior  from  past  observations. 
Reliability prediction also differs in some ways from areas such as astronomical prediction. In reliability work one is not generally concerned with
dynamics — evolution  from  an  observable  initial  state  via  discoverable  laws  of 
behavior.  (An  exception  to  this  statement  is  provided  by  the  related  area  of 
failure  analysis.)  Reliability  prediction  usually  attempts  to  draw  inferences 
from  similarities  of  a  system  of  interest  to  hardware  previously  evaluated.  The 
complex conditions of environment and use make more detailed treatments of reliability prediction problems very difficult. One can even argue that some kinds
of  reliability  problems  do  not  exhibit  a  failure  dynamics  of  much  interest.  For 
example,  in  the  random  hazard  situation  one  is  interested  in  postponing  catastrophic 
loss  of  function  rather  than  examining  its  (rapid)  development  in  time  when  it 
does  occur.  In  contrast,  of  course,  the  dynamics  of  wearout  phenomena  are  a 
major  issue. 

From  earlier  sections  of  the  report  we  have  seen  that  even  under  the  most 
ideal  conditions  reliability  information  may  be  expected  to  exhibit  a  highly 
distributed  character.  Of  course  astronomical  observations  are  also  distributed 
although  dispersion  effects  are  often  much  less  significant  giving  the  latter 
an  appealing  flavor  of  determinism.  This  difference  is  not  due  to  the  inability 
of  reliability  studies  to  attract  intellectual  giants  to  play  the  roles  Brahe, 
Galileo, Kepler, and Newton did for astronomy. Neither is it the result of a
lack  of  substantial  and  sustained  funding.  Reliability  prediction  is  an  awkward 
and  difficult  discipline  because  of  the  diversity  of  objects  of  interest,  the 
variability  of  environmental  and  use  conditions,  and  difficulties  in  defining 
process  endpoints.  Nevertheless  when  the  economic  benefits  of  improved  product 
performance  are  considered,  reliability  studies  are  found  to  be  very  cost  effective. 

In  the  following  sections  we  consider  briefly  several  relevant  aspects  of 
reliability  prediction. 

4.1  Original  Impetus 

Modern  reliability  studies  as  a  formal  discipline  are  generally  taken 
to have originated with the German V rocket programs of World War II. Early thoughts were that quality of manufacture would be a secondary issue in a device intended to see service of only some tens of minutes. The incorrectness of this
line  of  reasoning  was  emphasized  by  the  failure  of  virtually  all  of  the  rockets 
initially  built.  It  was  then  recognized  that  to  have  even  a  moderate  chance  of 
performing  satisfactorily,  a  complicated  system  must  be  built  of  very  highly 
reliable components. The reliability of a system without redundancy is the product
of  the  component  reliabilities.  Learning  this  lesson  turned  out  to  be  one  of 
the  prerequisites  to  entering  the  space  age. 
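The product rule explains why early complicated systems fared so poorly. As a purely illustrative sketch (the component count and per-component reliability below are hypothetical, not data from the rocket programs), a nonredundant system of 1000 components, each 99.9% reliable over the mission, has a system reliability of only about 0.37.

```python
import math


def series_reliability(component_reliabilities):
    """Reliability of a nonredundant system: product of component reliabilities."""
    return math.prod(component_reliabilities)


if __name__ == "__main__":
    # Hypothetical example: 1000 identical components, each 0.999 reliable.
    print(f"{series_reliability([0.999] * 1000):.4f}")
```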

When reliability concepts began to be introduced in connection with military
procurements  in  this  country,  one  of  the  first  aggravating  dispersion  effects  in 
this  field  surfaced.  Different  contractors  bidding  on  the  same  job  would  predict 
substantially  different  reliabilities  for  their  versions  of  the  desired  product. 
Complicating  the  situation  was  that  these  conclusions  were  developed  by  using  a 
variety  of  different  unstandardized  sources  of  supporting  information.  The  climate 
was  one  that  did  not  permit  easy  evaluation  of  the  relative  merits  of  competing 
proposals  for  the  same  work.  Thus  a  major  interest  of  the  government  in  supporting 
the  development  of  universal  reliability  prediction  methods  was  to  put  competing 
contractors  on  equal  terms.  If  all  bidders  were  using  the  same  comprehensive 
source  of  reliability  data  in  the  same  way,  it  was  reasoned  that  superior  predicted 
reliability  would  be  a  reflection  of  a  better  hardware  design.  Ambitious  as  it 
sounds, such a scheme has been implemented. We now have a variety of military
standards,  handbooks,  and  procedures  in  place  providing  instruction  for  the  uniform 
disposition  of  reliability  questions  relating  to  procurements.  Putting  competing 
contractors  on  an  equal  footing  has  been  pretty  well  accomplished.  In  fact  the 
heroic  and  well  maintained  edifice  of  reliability  prediction  tools  has  become 
so  thoroughly  entrenched  that  its  users  have  largely  forgotten  its  origins.  There 
is  a  tendency  among  procurement  managers  to  view  reliability  prediction  as  a  mature 
and promising approach to obtain the kinds of answers they need for solving hardware supply problems. Reliability studies are beneficial but they often fall short
of  the  expectations  people  outside  the  field  have  for  them. 

We  have  already  seen  that  reliability  prediction  has  succeeded  in  a  relative 
way by equalizing the procurement process. Not much work has been done in evaluating the absolute success of reliability prediction (how well does prediction compare with measurements on the same equipment?). In one case that has come to this author's attention involving avionics radio equipment, prediction of the system MTBF produced values ranging from 2 to 10 times larger than those subsequently measured. Clearly this level of correlation does not allow prediction to be substituted for actual in-service measurements if a meaningful hardware evaluation is desired.


4.2 Current Practice

As  presently  implemented,  reliability  prediction  generally  takes  one  of  two 
basic  forms.  The  standard  handbook  approach  is  very  commonly  used  in  the  military 
hardware  procurement  setting  for  which  it  was  developed.  The  other  major 
reliability  discipline  that  lends  itself  to  performance  prediction  is  called 
probabilistic  design.  From  the  user's  viewpoint  handbook  reliability  prediction 
represents a sort of cookbook approach to the problem. Probabilistic design is more analytical, distributional in focus, and academic in flavor. One should not
feel  that  one  approach  is  correct  and  the  other  isn't.  Handbook  prediction  at 
times  seems  to  lack  rigor  but  nonetheless  can  be  applied  to  rather  complicated 
problems.  Probabilistic  design  can  be  quite  definitive  for  small  scale  problems 
but  often  is  prohibitively  difficult  or  preempted  by  information  gaps  in  larger 
settings.  We  will  look  further  at  some  of  the  basic  features  of  these  two 
approaches  in  the  next  two  subsections  of  the  report. 


4.2.1  Handbook  Methods 


Under the handbook methods heading we will limit discussion to topics dealt with in the dominant source work in this field, MIL-HDBK-217C (Ref. 7). The basic nature and use of handbook techniques and information will be considered. It is also well to be aware of the proper scope and limitations of handbook prediction. Reference 7 addresses the latter point in Section 1.3. Electronic components and systems are taken in this setting to be exponentially reliable. Thus failure rates are additive and time independent.

MIL-HDBK-217C is the latest revision of the most definitive document relating
to  the  problem  of  correlating  the  observed  and  expected  behavior  of  important 
classes  of  electronic  systems  and  components.  It  summarizes  in  tabular  form  vast 
quantities  of  data  accumulated  under  actual  field  service  conditions.  Most  of 
MIL-HDBK-217C treats a reliability prediction method called "Part Stress Analysis".
This  is  a  rather  detailed  kind  of  description  requiring  complete  design  and 
operating  information  relating  to  the  system  of  interest.  Implementing  this  approach 
requires  one  to  know  a  great  deal  about  prevailing  thermal  conditions,  electrical 
loading,  and  the  service  environment  generally.  Assuming  these  identifications 
can  be  made,  the  handbook  provides  associated  failure  rate  information  either  in 
tabular  or  analytic  form.  Generally  speaking,  the  desired  reliability  information 
is structured as a base failure rate modified by factors for environment, quality, use, etc. The base failure rate incorporates temperature and primary electrical effects and is specific to device category. The modifiers are multiplicative quantities called π-factors. In virtually all component categories the environmental and quality factors $\pi_E$ and $\pi_Q$ appear. A variety of other π-factors generally also occur. For example, the part failure rate model for general purpose diodes is expressed as

$$\lambda_p = \lambda_b\,(\pi_E \times \pi_Q \times \pi_R \times \pi_A \times \pi_{S2} \times \pi_C)\ \text{failures}/10^{6}\ \text{hours}. \tag{45}$$


The subscripts R, A, S2, and C refer to current rating, application, voltage stress, and construction respectively. The base failure rate for discrete semiconductors (including diodes) is represented as

$$\lambda_b = A\,\exp\!\left(\frac{N_T}{273+T+(\Delta T)S}\right)\exp\!\left[\left(\frac{273+T+(\Delta T)S}{T_M}\right)^{P}\right], \tag{46}$$

where A is a scaling factor; $N_T$, $T_M$, and P are shaping parameters; and T, ΔT, and S are temperature and thermal and electrical stress derating factors.

The base failure rate and the π-factors for each device category treated are all presented in MIL-HDBK-217C. As an example of the range of variation possible, the environmental and quality π-factors for general purpose diodes are presented in Table II. The usual eleven environmental stress levels and five levels of component quality are displayed. One can see from considering these two factors alone that corrections to the base failure rate can be hundreds of times larger than the base rate itself.
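The multiplicative structure of Eq. (45) is simple to express in code. The sketch below shows only the arithmetic of the part stress model. Every numerical value in it is hypothetical and chosen for illustration; none is taken from MIL-HDBK-217C or Table II, and the function name is an assumption.

```python
import math


def part_failure_rate(base_rate, pi_factors):
    """Eq. (45) pattern: base failure rate times the product of the pi-factors.

    Rates are in failures per 10^6 hours.
    """
    return base_rate * math.prod(pi_factors.values())


# Hypothetical values for illustration only; not handbook data.
example_base_rate = 0.002
example_factors = {"E": 5.0, "Q": 2.0, "R": 1.0, "A": 1.5, "S2": 0.7, "C": 1.0}

if __name__ == "__main__":
    rate = part_failure_rate(example_base_rate, example_factors)
    print(f"part failure rate = {rate:.4f} failures per 10^6 hours")
```

Because the factors multiply, a harsh environment combined with a low quality grade compounds quickly, which is how the corrections can come to dwarf the base rate.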

MIL-HDBK-217C displays a variety of analogs to Eq. (45) relating to electronic components other than diodes. The versatility of this form of description comes from the introduction of a wide range of π-factors relating to diverse properties affecting the performance of different classes of components. Similarly, Eq. (46) is only a representative case. Other models are given in Ref. 7 relating to different situations. In reliability work descriptions such as Eq. (46) are called "physics-of-failure" models. To a physicist this language is a little heady, suggesting model development based on derivations from first principles. Actually, failure rate models should more properly be thought of as phenomenological characterizations. Forms have been developed that, with the adjustment of relatively few parameters, allow a large amount of field experience to be cataloged and summarized. There is no need to apologize for this situation. Thermodynamics is largely a phenomenological discipline. The latter also has a proper microscopic basis in statistical mechanics, of course. One can think in terms of exploring reliability problems more fundamentally with a view toward correlating cause and effect. This is the failure analysis approach, which occasionally is invaluable. It must be used sparingly, however, in order to keep the scope of the overall problem within tractable bounds.

The  electronic  components  for  which  handbook  reliability  data  sources  have 
been  developed  are  treated  by  generic  class.  There  are  many  hundreds  of  junction 
transistor  types  that  carry  distinct  part  numbers.  These  are  not  distinguished 
for handbook reliability purposes; they are all simply Group 1 discrete semiconductors. Obviously then handbook reliability data is class average information. In the handbook setting measures of dispersion within classes are never reported. Similarly the user is never made aware of how much test/service time and how many observed failures support reported failure rates. Thus one is not in a position of being able to make statements about the statistical quality of handbook prediction. This is consistent with the view of the authors of MIL-HDBK-217C
who  disclaim  the  absolute  accuracy  of  handbook  predictions  while  maintaining 
their  relative  usefulness  in  the  parallel  procurement  setting.  Obviously  the 
handbook  user  is  not  being  misinformed.  One  can  argue,  however,  that  he  is  left 
seriously  uninformed  by  a  method  that  suppresses  and  fails  to  pass  on  available 
dispersion  information. 

Thus  far  in  this  section  we  have  discussed  the  Part  Stress  Analysis,  or 
more  detailed  type  of  handbook  reliability  description.  Its  implementation  calls 
for a mature system design and rather complete knowledge of component electrical stresses, power dissipation, thermal and mechanical loads, etc. During the
preparation  and  evaluation  of  contractor  proposals  and  early  product  design 
phases,  much  of  this  information  is  not  available.  In  this  setting  the  "Parts 
Count"  reliability  prediction  method  is  often  employed.  Here  one  need  only 
identify  components  by  generic  type,  quantities,  quality  levels,  and  the  intended 
operating  environment.  The  total  equipment  failure  rate  is  given  by  evaluating 
the  sum 


$$\lambda_{EQUIP} = \sum_{i=1}^{n} N_i \, (\lambda_G \pi_Q)_i , \tag{47}$$

where

$\lambda_G$ = failure rate for the $i$th generic part
$\pi_Q$ = quality factor for the $i$th generic part
$N_i$ = quantity of the $i$th generic part
$n$ = number of different part categories.


Equation  (47)  applies  to  a  single  operating  environment.  If  different  sections 
of  the  equipment  operate  in  different  environments,  then  partial  sums  of  the  form 
of Eq. (47) should be performed on a per-operating-environment basis and added.
The parameters of Eq. (47) are tabulated in Section 3.0 of MIL-HDBK-217C. Testimony
to the relative simplicity of this approach is that exposition of the method
and presentation of all the supporting material requires only 10 pages of text.
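
The bookkeeping behind Eq. (47) can be sketched in a few lines of Python. This is only an illustration of the summation: the part names, quantities, generic failure rates, and quality factors below are hypothetical and are not taken from MIL-HDBK-217C, and the names `PartCategory` and `equipment_failure_rate` are our own.

```python
from dataclasses import dataclass


@dataclass
class PartCategory:
    """One generic part category in a Parts Count prediction."""
    name: str
    quantity: int          # N_i, number of parts of this generic type
    generic_rate: float    # lambda_G, failures per 10^6 hours (hypothetical)
    quality_factor: float  # pi_Q, dimensionless quality multiplier (hypothetical)


def equipment_failure_rate(parts):
    """Evaluate Eq. (47): the sum of N_i * (lambda_G * pi_Q)_i over all categories."""
    return sum(p.quantity * p.generic_rate * p.quality_factor for p in parts)


# Hypothetical parts list for a single operating environment.
parts = [
    PartCategory("resistor", 40, 0.002, 1.0),
    PartCategory("capacitor", 25, 0.010, 3.0),
    PartCategory("transistor", 10, 0.050, 2.0),
]

lambda_equip = equipment_failure_rate(parts)
print(f"lambda_EQUIP = {lambda_equip:.2f} failures per 10^6 hours")
```

For equipment spanning several environments, one such sum would be formed per environment and the partial sums added, as described above.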


If, as we have seen, the accuracy of Part Stress Analysis reliability prediction
is suspect, then it is necessary to approach Parts Count results with still greater
skepticism. This is true because a great deal of relevant information relating
to  use  conditions  is  simply  not  available  at  this  stage.  Caution:  If  a  contractor 
offers  to  just  barely  meet  mission  reliability  objectives  on  the  basis  of  a  Parts 
Count  prediction,  let  the  buyer  beware.  Similarly,  it  is  folly  to  contemplate 
substituting  any  form  of  prediction  for  a  bona  fide  post-manufacture  reliability 
verification  study  if  one  really  wants  to  properly  characterize  hardware  performance. 

4.2.2  Probabilistic  Design 

Probabilistic  design  refers  to  a  developing  method  of  approaching  reliability 
and  related  engineering  design  problems  that  emphasizes  their  statistical  aspects. 
Every  measurable  engineering  parameter  is  taken  to  be  distributed  rather  than 
deterministic  (having  a  single  value  only).  The  performance  of  an  entity  of 
interest  is  described  in  terms  of  the  stresses  it  is  subjected  to  and  its  strength 
or ability to function in a given stress environment. In quantifying this approach,
strength is defined simply as the stress level at which failure occurs. These
terms are used here in a generalized sense. Thus stress may be electrical, thermal,
mechanical, hygrometric, etc., that is, any relevant loading aspect of the situation
of  interest.  Failure  must  also  be  adequately  defined  whether  it  be  catastrophic, 
onset  of  irreversible  damage,  or  some  specified  property  degradation.  Against 
this  background  reliability  is  defined  as  the  probability  that  strength  exceeds 
stress.  For  a  system,  of  course,  one  has  to  ask  this  question  simultaneously 
about  every  relevant  stress/strength  facet.  The  key  to  performing  probabilistic 
design  is  to  properly  characterize  the  strength  distributions  of  a  piece  of 
hardware  and  also  identify  from  a  distributional  viewpoint  the  stresses  operative. 
Kececioglu (Ref. 8) gives a detailed fifteen-step methodology charting how one might
systematically grapple with a probabilistic design problem. Reference 3 also
treats this topic in Chapter 4 with some minor variations from the approach presented
in Ref. 8. The topical areas of Kececioglu's probabilistic design methodology
are listed in Table III. It is beyond the proper scope of this report
to  attempt  to  convey  to  the  reader  a  working  appreciation  of  the  probabilistic 
design  method.  However,  an  effort  will  be  made  to  elucidate  the  underlying 
philosophy  of  the  approach.  Probabilistic  design  is  basically  a  very  detailed 
stress/strength  overlap  calculation.  Reliability  is  simply  the  probability  that 
strength  exceeds  stress  under  the  conditions  of  the  intended  application.  One 
can  see  from  Table  III  that  reliability  prediction  is  one  facet  of  probabilistic 
design.  Normally,  however,  the  focus  is  not  on  prediction  of  reliability  but  is 
directed  toward  tailoring  design  parameters  to  achieve  a  desired  performance 
objective.  In  either  case  the  price  of  measuring  success  can  be  quite  high. 
Probabilistic design is a demanding discipline in terms of the quantity and quality
of informational inputs required.
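
To make the stress/strength overlap idea concrete, the following Python sketch evaluates the probability that strength exceeds stress for one simple case. It assumes, purely for illustration, that stress and strength are independent and normally distributed; the report does not prescribe these distributions, and the numerical values are hypothetical. Under that assumption the difference (strength minus stress) is itself normal, so the overlap reduces to a single evaluation of the normal cumulative distribution.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def static_reliability(strength_mean, strength_sd, stress_mean, stress_sd):
    """P(strength > stress) for independent normal strength and stress (assumed)."""
    # Standard deviation of the safety margin (strength - stress).
    margin_sd = math.hypot(strength_sd, stress_sd)
    return normal_cdf((strength_mean - stress_mean) / margin_sd)


# Hypothetical values in arbitrary but consistent stress units.
R = static_reliability(strength_mean=100.0, strength_sd=10.0,
                       stress_mean=70.0, stress_sd=10.0)
Q = 1.0 - R  # overlap unreliability for a single stress/strength encounter
print(f"R = {R:.4f}, Q = {Q:.4f}")
```

Widening either distribution or moving the means closer together increases the overlap and lowers R, which is the design lever the probabilistic approach sets out to quantify.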

One  should  expect  the  probabilistic  design  approach  to  reliability  questions 
to  be  ultimately  compatible  with  relevant  phenomenological  descriptions.  For 
example,  exponential  reliability  is  implied  by  a  situation  involving  static  and 
somewhat overlapping distributions of stress and strength. This leads to a constant
vulnerability or failure probability per unit time or per load cycle. In
contrast, wearout is characterized by a monotonic loss of strength due to either
fatigue under load or dissipative influences of the service environment. This
increases the interference of stress and strength distributions, leading to an
increasing probability that additional service will result in failure. A specific
example of corrosion wearout is treated probabilistically in Section 5.4.

The  early  failure  situation  is  also  easily  interpreted  from  the  probabilistic 
design viewpoint. In this case the initial strength distribution is skewed to
the  left  embracing  substandard  components.  Application  of  normal  service  stresses 
leads  to  significant  stress/strength  overlap  and  a  high  probability  of  premature 
loss  of  function.  In  this  case  failures  may  also  occur  in  transit  or  otherwise 
prior  to  being  placed  in  service. 


4.3  Limitations  in  the  Sonar  Context 


We  have  just  discussed  the  motivations  for  and  some  of  the  major 
developments  in  the  area  of  reliability  prediction.  In  recognition  of  the 
importance  of  product  reliability  and  the  successes  of  reliability  studies  in 
other  areas,  the  Navy  sonar  community  has  taken  steps  to  systematically  improve 
sonar  hardware  through  efforts  having  reliability  as  a  specific  focus.  This 
kind  of  commitment  has  already  produced  beneficial  results.  Thus  far,  however, 
the  benefits  have  been  largely  of  a  debugging  nature — discovery  of  overt  design 
or  manufacturing  defects — rather  than  the  optimization  of  designs  already 
established  as  workable.  To  those  who  felt  that  reliability  prediction  was 
already  mature  science  (or  art),  recent  progress  in  the  sonar  area  has  seemed 
painfully  slow.  There  are  several  reasons  that  this  should  be  so.  Much  sonar 
transducer  reliability  prediction  work  attempts  a  description  from  a  components- 
level,  random  hazard  point  of  view.  This  suffers  from  certain  weaknesses.  Sonar 
transducers  are  not  assembled  exclusively  from  components  that  are  properly 
characterized  by  constant  hazard  functions.  Wearout  processes  such  as  metal 
fatigue and corrosion and water permeation of elastomeric materials are also
operative. Many components are nonstandard from a reliability accounting point
of view, so that handbook failure rate source materials don't apply. This places
the reliability data acquisition task in the hands of transducer production contractors
or even the end user, the Navy.

A primary purpose of this report is to help the reader cultivate an appreciation
of the enormity of the task of gathering meaningful reliability data for
transducer systems. Why do transducers pose a particularly difficult problem?
There are several reasons. Transducers are intended to be long lived and opportunities
for  observation  and  maintenance  of  installed  units  are  few  and  inconvenient. 
Transducers  are  a  specialty  item  and  production  quantities  are  usually  quite 
limited.  These  two  factors  combine  to  make  it  very  difficult  to  gather  together 
enough  units  to  do  statistically  significant  reliability  testing.  An  even  greater 
challenge  is  to  produce  reliability  results  in  a  timely  fashion — when  they  can 
impact  the  hardware  involved  during  design  and  development  stages. 

To  this  author's  knowledge  a  complete,  integrated  transducer  study  advertised 
to  be  a  probabilistic  design  evaluation  has  never  been  attempted.  And  yet  many 
of the elements of a probabilistic design study are routinely developed by transducer
acoustic design specialists and production engineers, persons whose focus
is more on performance than reliability per se. Dynamic stress analyses of driven,
mass-loaded piezoelectric ceramic and fatigue loading studies of stress rods and
pressure release systems are examples. There are many case histories where this
kind of evaluation has led to design changes or manufacturing adjustments associated
with dramatic transducer reliability improvements. Significant progress
is  possible  and  has  been  achieved  in  areas  where  operational  stresses  and  the 
strengths  of  component  materials  employed  are  both  well  characterized.  The 
difficult  situations  are  those  where  the  properties  of  the  materials  involved 
change  with  time  and  temperature  and  perhaps  loading  history  and  the  stresses 
operative have environmental origins and exhibit large fluctuations. Well-developed
mechanistic stress/strength overlap interpretations of phenomena such
as water permeation, corrosion, and bond degradation have not yet been given.
These are areas known to be important and hardware life-limiting in many situations.
This provides a strong incentive but does not otherwise simplify the
large  task  of  assembling  distributional  information  needed  for  probabilistic 
evaluation  of  these  highly  variable  processes.  Specific  areas  of  difficulty  are 
discussed  in  greater  detail  in  subsequent  sections  of  the  report. 

5.0 LIFE PREDICTION


In  connection  with  characterizing  the  serviceability  of  hardware,  the  term 
life  refers  to  the  entire  period  of  useful,  failure-free  operation.  Life  as  a 
descriptive parameter is dual to reliability. Reliability is the probability of
realizing acceptable performance for a given period of time (some mission duration
or the period between overhauls, for example). Life is the actual time that such
performance is achieved. Many similar equipments can be monitored to obtain a
distribution of lifetimes. Formally, from Eqs. (2) and (6) of Table 1, reliability
is one minus the cumulative of the distribution of lifetimes. Conversely, from
Eq. (3c), the distribution of equipment lifetimes is equal to the negative time
derivative of the reliability function. As with reliability, notice the distributional
flavor when we speak of hardware service life. We never inquire how long a
specific item will continue to operate. Rather we ask about its expected life or
the average or most probable lives of similar equipments. These are measures of
central tendency of some body of distributed information. If we are sophisticated,
we also look into the dispersion, asymmetry, etc. of the distribution when this
seems  justified  by  the  quality  of  sampling  statistics. 

Since  service  life  and  reliability  are  distributionally  related,  life 
prediction  and  reliability  prediction  are  really  equivalent  exercises.  For 
example,  in  doing  handbook  reliability  prediction  one  arrives  at  a  superposition 
failure  rate,  inverts  it  to  obtain  the  mean  time  between  failures  (MTBF),  and 
exponentiates  the  (negative)  failure  rate  times  time  to  obtain  reliability.  Thus 
conjugate reliability and life information is developed simultaneously. If
reliability and life are so closely related, why do we address the two as separate
topics in this report? This is a good question. Within the context of reliability
theory the separation seems unnatural. But when practical concerns are raised
the reverse is true. Basically, whether equipment is used in a military,
commercial, or consumer setting, one is interested in two things: how well will
the hardware function and for how long? The answers to these questions enable us
to  determine  whether  mission  requirements  will  be  met  and  what  maintenance  and 
replacement  costs  and  schedules  will  be.  How  well  does  equipment  function  (over 
some  specified  time  interval)?  This  is  reliability.  How  long  does  it  continue 
to  work?  This  is  life. 
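
The handbook chain described above (summed failure rate, then MTBF, then exponential reliability) is short enough to sketch directly. The failure rate and mission duration below are hypothetical, and the function names are ours. The sketch also shows that, under the exponential model, the probability of surviving for one full MTBF is only e^-1, or about 37 percent.

```python
import math


def mtbf(failure_rate):
    """Mean time between failures for a constant failure rate."""
    return 1.0 / failure_rate


def reliability(failure_rate, t):
    """Exponential reliability: probability of surviving to time t."""
    return math.exp(-failure_rate * t)


failure_rate = 1.83e-6   # hypothetical superposition failure rate, per hour
mission_hours = 5000.0   # hypothetical mission duration

print(f"MTBF             = {mtbf(failure_rate):.3e} hours")
print(f"R(mission)       = {reliability(failure_rate, mission_hours):.4f}")
print(f"R(t = MTBF)      = {reliability(failure_rate, mtbf(failure_rate)):.4f}")
```

Both quantities come from the same single parameter, which is why the life and reliability descriptions are formally equivalent in the constant hazard case.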

The  reliability/life  dichotomy  sorts  itself  out  somewhat  when  we  distinguish 
random  hazard  and  wearout  effects.  In  the  latter  case  times  to  failure  often 
tend  to  cluster  so  that  any  measure  of  central  tendency  is  a  reasonably  descriptive 
service  life  estimate.  In  contrast  for  the  random  hazard  situation  MTBF  is  a 
poor  measure  of  the  broad  exponential  distribution  of  times  to  failure  that  one 
expects  to  encounter.  A  much  crisper  description  is  obtained  by  specifying  the 
probability  of  surviving  a  mission  of  given  duration,  i.e.,  the  reliability.  One 
can  further  interpret  the  random  hazard  situation  as  exhibiting  a  constant  overlap 
of  the  associated  stress  and  strength  distributions  or  an  invariant  vulnerability 
to  catastrophic  damage  due  to  extreme  load  fluctuations.  The  reliability  parameter 
is  a  measure  of  this.  Wearout,  on  the  other  hand,  is  characterized  by  the 
accumulation  of  damage  until  residual  strength  is  commensurate  with  load  stresses 
encountered  in  normal  service.  Failure  is  inevitable  and  often  with  a  very 
predictable  time  scale. 

Often a system of interest exhibits random hazard and wearout effects
simultaneously. These may be independent or strongly coupled. For example, a
capacitor may manifest an exponential voltage breakdown reliability characteristic
and suffer fatigue wearout failures with thermal cycling. The former is a measure
of the intrinsic resistance of the dielectric material to perforation under
overvoltaging conditions. The latter may be due to improper lead dressing
resulting in excessive flexure in service. These two effects are unrelated and
we may speak of both a (voltage breakdown) reliability and a (mechanical fatigue)
wearout  life.  The  two  effects  can  also  be  treated  together  in  terms  of  either 
reliability  or  life  concepts. 

An  automobile  tire  is  an  example  of  a  system  which  exhibits  coupled  random 
hazard  and  systematic  or  wearout  reliability  aspects.  The  stress  environment  is 
due to ordinary road hazards: stones, chuckholes, railroad tracks, etc. However,
the  strength  of  the  tire  or  its  ability  to  survive  exposure  to  the  stress 
environment  without  sustaining  damage  is  not  static.  Rather  the  strength  decreases 
as  tread  material  is  worn  away  in  normal  service.  Thus,  the  random  hazard 
vulnerability  increases  as  the  wearout  process  proceeds.  And  of  course  if  normal 
preventive  maintenance  steps  were  eschewed  in  this  case,  the  tire  would  eventually 
succumb  to  a  programmed  wearout  failure  due  to  a  random  load  stress. 

The  practice  of  speaking  of  reliability  and  life  as  if  they  were  unrelated 
may  stem  from  formulating  separate  treatments  of  the  random  hazard  and  wearout 
aspects  of  reliability  problems.  Life  prediction  must  then  emphasize  wearout 
considerations.  Let  us  bear  this  distinction  as  well  as  the  formal  unity  of  the 
subject  matter  in  mind  as  we  further  explore  the  life  prediction  problem  in  the 
following  sections  of  the  report. 

5.1 Definition of Life

We  have  already  noted  that  life  in  reference  to  hardware  is  the  period  of 
useful,  failure-free  operation.  For  reliability  evaluation  purposes  it  is  often 
necessary to be very specific about what constitutes acceptable performance.
This may be easily accomplished, as in the case of an incandescent lamp for
home  lighting.  If  it  lights  when  voltage  is  applied,  it  is  good;  otherwise  it  is 
considered  failed.  In  the  electric  light  case  the  transition  between  good  and 
failed  is  usually  quite  abrupt  corresponding  to  the  evaporation  of  a  portion  of 
an  old  and  weakened  filament.  Most  hardware  evaluation  situations  are  more 
complicated  and  subtle.  In  a  photo-processing  application  our  electric  lamp  may 
have to be discarded when its intensity or spectral output falls outside acceptable
limits  rather  than  when  the  filament  disintegrates.  These  two  cases  are  examples 
of  general  classes  of  failure  criteria:  the  sudden  and  complete  loss  of  some 
physical  function  or  the  gradual  migration  of  a  performance  property  outside  the 
normal  useful  range.  In  complicated  equipment,  of  course,  many  subsystems  must 
simultaneously  meet  appropriate  performance  tests.  The  more  restrictive  such 
situations  are,  the  more  difficult  it  is  to  be  assured  in  practice  that  in-use 
equipment  is  performing  adequately.  For  example,  in  sonar  applications  terminal 
resistance  or  drive  power  measurements  do  not  provide  detailed  information  on 
transducer  efficiency  or  array  beam-forming  characteristics. 

We  have  spoken  of  failure-free  operation  as  defining  the  period  of  useful 
equipment  life.  This  does  not  mean  that  no  failures  can  be  tolerated.  Obviously 
when a repair is effected, equipment is revitalized and can be returned to service.

One  might  be  tempted  in  some  circumstances  to  define  service  life  to  be  that 
period  of  time  when  the  hardware  reliability  remains  above  some  specified  level. 
This turns out to be circular since reliability is related to the time-to-failure
probability  density  function  or  distribution  of  service  lifetimes.  Thus,  it  is 
necessary  for  individuals  who  draw  up  equipment  specifications  to  decide  very 
specifically  what  performance  requirements  need  to  be  imposed.  Reliability  and 
service life are not defined except with respect to specified levels of performance
(or complete definition of failure or nonperformance) under conditions of
environment  and  use  that  are  also  fully  characterized. 


5.2  Some  Dynamics  of  the  Terminal  Process 

It  is  often  convenient  to  identify  the  condition  that  corresponds  to  hardware 
failure  (or  the  end  of  useful  life)  as  irreversible  damage  due  to  some  form  of 
overstress.  The  transition  from  an  unfailed  to  a  failed  state  we  choose  to  call 
the  terminal  process.  This  may  occur  rapidly  with  the  application  of  an 
environmental  overstress  to  a  "good  as  new"  structure.  The  vulnerability  to 
rapid  catastrophic  failure  may  also  build  gradually  via  a  wearout  process  associated 
with  or  incidental  to  normal  use.  Corrosion  and  fatigue  have  already  been  compared 
and  contrasted  from  this  point  of  view.  In  addition  catastrophic  change  of  state 
or  loss  of  function  need  not  occur  at  all  for  an  item  to  be  declared  worn  out. 
Automobile  tires,  for  example,  are  ordinarily  replaced  in  response  to  cues  less 
dramatic  than  a  flat  or  a  blowout. 

It  is  not  our  purpose  here  to  engage  in  serious  failure  analysis.  We  shall 
avoid  attempting  to  review  specifically  what  can  go  wrong  with  the  devices  we 
build.  It  is  of  the  utmost  importance,  however,  to  recognize  that  equipment  can 
and  will  malfunction.  This  realization  together  with  the  motivation  it  stimulates 
to  pursue  sound  design  principles  and  constructive  maintenance  practices  may  be 
our  best  defense  against  unreliable  hardware.  Regarding  the  relative  importance 
of  these  two  features  (design  and  maintenance)  we  can  look  to  the  complicated 
organic  systems  (including  man)  found  in  nature.  Here  repair  is  as  dynamic  and 
highly  organized  as  creation  itself. 


5.2.1  Random  Hazard  Case 


We  have  looked  at  the  random  hazard  situation  from  a  reliability  point  of 
view.  Now  let  us  consider  this  case  from  a  perspective  emphasizing  stress/strength 
overlap  and  the  mechanism  of  failure.  The  stress/strength  overlap  description 
of  a  random  hazard  problem  is  static  in  the  sense  that  the  strength  distribution 
is  taken  to  be  fixed.  This  means,  of  course,  that  the  conditions  of  use  do  not 
physically  degrade  the  item  of  interest  (until  the  ultimate  catastrophic  failure 
occurs). Also we are concerning ourselves with operation under uniform environmental
conditions. A uniform environment is not one that does not exhibit variations.
Rather it is statistically repetitive over time intervals of reasonable
length. Shooman, in Chapter 8 of Ref. 9, considers the application of stresses
distributed randomly in time according to a Poisson distribution

$$P_n(t) = \frac{(vt)^n e^{-vt}}{n!} \tag{48}$$

to a part having a static strength represented by the distribution f(S).
Equation (48) gives the probability that n stresses will occur in a time interval
of duration t. The amplitudes of these stresses are understood to be distributed
as some f(s). Shooman (Ref. 9) calculates the probability that n stresses will occur and
that the component of interest will survive all of them. Summing over all n yields
the overall time dependent probability of success or component reliability

$$R(t) = e^{-Qvt}, \tag{49}$$

where Q is the static unreliability associated with a single probabilistic stress/strength
overlap encounter. Kececioglu and Cormier (Ref. 10) have provided the analytical
machinery for calculating Q and its complement, the static, single stress cycle
reliability R = 1 - Q, via the expressions

(50a, 50b)

(51a, 51b)


Equation (49) gives the time dependent survival probability of a component exposed
to stresses imposed randomly at an average rate of v per unit time when the
probability of surviving a single such load cycle is R = 1 - Q. Combining Eq. (49)
with Eqs. (3c) and (4a), the time-to-failure probability density and hazard rate
functions are

$$f(t) = Qv\,e^{-Qvt}, \tag{52}$$

$$\lambda(t) = Qv. \tag{53}$$

The case we are dealing with assumes that the static unreliability Q is constant.
We see then that randomly stressing components of invariant strength does in fact
correspond to the constant hazard rate, exponential time-to-failure pdf, exponential
reliability situation.
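
The summation over n that leads to Eq. (49) can be checked numerically. The Python sketch below weights the probability of surviving n independent stress applications, (1 - Q)^n, by the Poisson probability that exactly n applications occur in time t, and compares the sum with the closed form exp(-Qvt). The values of Q, v, and t are hypothetical, and the function names and the truncation limit `n_max` are our own choices.

```python
import math


def poisson_pmf(n, mean):
    """Probability of exactly n events for a Poisson distribution with the given mean."""
    if mean == 0.0:
        return 1.0 if n == 0 else 0.0
    # Work in log space so large n does not overflow.
    return math.exp(-mean + n * math.log(mean) - math.lgamma(n + 1))


def survival_by_summation(q, rate, t, n_max=200):
    """Sum over n of P(n stresses in t) * P(surviving all n), truncated at n_max."""
    mean = rate * t
    return sum(poisson_pmf(n, mean) * (1.0 - q) ** n for n in range(n_max + 1))


def survival_closed_form(q, rate, t):
    """Eq. (49): R(t) = exp(-Q v t)."""
    return math.exp(-q * rate * t)


q, rate, t = 0.05, 2.0, 5.0  # hypothetical Q, v (per unit time), and t
print(survival_by_summation(q, rate, t))
print(survival_closed_form(q, rate, t))
```

The two printed values agree to within rounding error provided `n_max` is well above the mean number of stress applications vt.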

Further  insights  can  be  developed  from  this  model.  If  the  stress  environment 
is  altered  or  if  the  part  strength  distribution  is  modified  via  design,  materials, 
or  manufacturing  process  changes,  the  static  unreliability  Q  is  changed  to  a  new 
constant value. The exponential reliability model still applies but with a new
value of the parameter λ. This is the basis for environmental, quality, derating,
etc. factors employed in handbook prediction. If all applied stresses induce
part failure, then Q = 1 and λ = v. The model continues to exhibit exponentially
distributed times to failure and Poisson distributed failures per time interval.

This is a well known relationship between these two distributions (see, for example,
Appendix 10.A of Ref. 3). We can also see from Eq. (53) that the hazard rate λ
may be reduced by decreasing the stress/strength overlap unreliability Q. A plot
of a typical stress/strength overlap situation is shown in Fig. 14. Again
following Ref. 10, the unreliability Q is given via Eq. (51a) as the area under the
stress-at-failure distribution function

(54a, 54b, 54c)


There  are  a  variety  of  situations  for  which  the  above  description  would  not 
be  adequate.  If  the  component  were  required  to  operate  in  statistically  distinct 
environments,  Q  would  not  be  constant.  Temperature  dependence  of  the  strength 
distribution would also be a complicating feature. If the part strength distribution
is degraded by the application of stress, the unreliability Q would depend on
loading  history.  The  latter  case  represents  a  general  class  of  wearout  phenomena. 
We  consider  wearout  in  the  next  section  of  the  report. 


5.2.2  Wearout 


Wearout  refers  to  a  systematic  loss  of  functional  integrity  with  time.  This 
may  be  load  or  use  induced  as  in  the  case  of  fatigue.  Wearout  may  also  proceed 
independently of loading, as in the examples of corrosion or on-the-shelf degradation
of unstable chemicals. The characteristics usually emphasized in connection
with  wearout  phenomena  are  a  strongly  peaked  time-to-failure  pdf  and  its  associated 
increasing  hazard  rate. 

The  stress/strength  overlap  description  of  the  random  hazard  situation  given 
in  the  previous  section  may  be  readily  generalized  to  include  wearout  phenomena. 

In  the  simplest  case  we  retain  the  feature  of  stresses  Poisson  distributed  in 
equal  time  intervals.  However,  the  stress/strength  overlap  unreliability  Q  (or 
per-loading-cycle probability of component failure) is taken to be time dependent.
Thus Q → Q(t). If Q(t) is a decreasing function, we are dealing with early
failures  or  infant  mortality.  Wearout  is,  of  course,  described  by  an  increasing 
Q(t).  The  expressions  for  reliability,  time-to-failure  probability  density 
function,  and  hazard  rate  analogous  to  Eqs.  (49),  (52),  and  (53)  are 


$$R(t) = \exp[-Q(t)\,vt], \tag{55}$$

$$f(t) = v\left[Q(t) + t\,\frac{dQ(t)}{dt}\right]\exp[-Q(t)\,vt], \tag{56}$$

and

$$\lambda(t) = v\left[Q(t) + t\,\frac{dQ(t)}{dt}\right]. \tag{57}$$


In  a  wearout  situation  Q(t)  ranges  monotonically  from  a  low  initial  value  to  a 
maximum of unity. Q(t) = 1 represents such a severe strength degradation that
every applied stress may be expected to induce failure. This is a saturation
situation [Q(t) cannot become any larger] so that dQ(t)/dt = 0. In this case of
complete wearout we see that the hazard rate λ becomes equal to v, the parameter
of  the  Poisson  distribution  of  applied  stresses.  From  the  close  relationship  of 
the  Poisson  and  exponential  distributions,  v  is  also  recognized  as  the  average 
rate  of  occurrence  of  the  random  applications  of  stress. 
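
The behavior of Eqs. (55) through (57) can be explored with a small Python sketch. The saturating form chosen for Q(t) below, together with its initial value q0 and time constant tau, is a hypothetical illustration only; the report does not specify a functional form. The sketch evaluates the hazard rate at a few times and shows it approaching v once Q(t) has saturated at unity.

```python
import math


def overlap_q(t, q0=0.01, tau=1000.0):
    """Hypothetical wearout law: Q rises monotonically from q0 toward 1."""
    return 1.0 - (1.0 - q0) * math.exp(-t / tau)


def overlap_q_rate(t, q0=0.01, tau=1000.0):
    """Time derivative dQ/dt of the hypothetical wearout law."""
    return (1.0 - q0) / tau * math.exp(-t / tau)


def reliability(t, rate):
    """Eq. (55): R(t) = exp[-Q(t) v t]."""
    return math.exp(-overlap_q(t) * rate * t)


def hazard(t, rate):
    """Eq. (57): lambda(t) = v [Q(t) + t dQ/dt]."""
    return rate * (overlap_q(t) + t * overlap_q_rate(t))


def pdf(t, rate):
    """Eq. (56): f(t) = lambda(t) R(t)."""
    return hazard(t, rate) * reliability(t, rate)


rate = 0.01  # hypothetical average stress occurrence rate v, per hour
for t in (0.0, 500.0, 2000.0, 50000.0):
    print(f"t = {t:8.0f}  Q = {overlap_q(t):.4f}  "
          f"lambda/v = {hazard(t, rate) / rate:.4f}  R = {reliability(t, rate):.4e}")
```

With this particular Q(t) the ratio of hazard rate to v passes slightly above unity at intermediate times, because of the t dQ/dt term in Eq. (57), before settling to unity at saturation.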

Equations (55) through (57) relate the important reliability/life functions
to  the  stress/strength  overlap  parameter  Q(t)  and  the  average  frequency  of  stress 
occurrence  v.  In  a  still  more  general  context  the  rate  at  which  stress  applications 
occur  may  itself  be  time  dependent  so  that  v  becomes  v(t).  In  addition  the  time 
dependence of the quantity Q(t) may be due to both component strength degradation
and  time  dependence  of  the  distribution  envelope  of  applied  stress.  The  latter 
description  applies  to  changes  in  service  environment  or  conditions  of  use.  Such 
a  situation  requires  further  generalization  of  Eqs.  (55)  through  (57).  This  is 
streamlined  by  introducing  a  new  parameter  that  contains  all  the  time  dependence 
of the problem. Let

$$z(t) = Q(t)\,v(t)\,t. \tag{58}$$

In terms of z(t) the reliability, time-to-failure pdf, and hazard rate functions
take the simple forms

$$R(t) = e^{-z(t)}, \tag{59}$$

$$f(t) = \frac{dz(t)}{dt}\,e^{-z(t)}, \tag{60}$$

$$\lambda(t) = \frac{dz(t)}{dt}. \tag{61}$$


We  are  now  in  a  position  to  further  compare  and  contrast  life  prediction  and 
reliability  prediction.  In  one  sense  the  two  approaches  are  totally  equivalent 
via Eqs. (1a) and (3c). Usually, however, reliability prediction refers to
drawing conclusions from actual time-to-failure experience with components or
systems. This may be termed a macroscopic approach to evaluating the functional
forms of R(t), f(t), and λ(t) directly. Often this takes the form of using the
available  data  to  verify  that  some  model  such  as  the  exponential,  Weibull,  or  log 
normal is in fact appropriate. For the Weibull case, for example, one asserts

$$z(t) = \left(\frac{t-\gamma}{\eta}\right)^{\beta}$$

and adjusts γ, β, and η to best represent the data. In implementing
the life prediction approach it would not be necessary to observe actual
failures in normal field service. Rather one would characterize the operating
stress environment (by ascertaining both a frequency profile v(t) and an amplitude
distribution f(s) representing the loading situation). A study to determine
the strength distribution f(S) of the item of interest is also required. Complicating
features are that f(s) and f(S) may themselves both be time dependent
(perhaps  implicitly  via  another  factor  such  as  temperature).  If  all  of  this 
information  is  obtainable,  Eqs.  (51)  and  (58)  may  be  used  to  evaluate  z(t).  The 
reliability,  life,  and  hazard  functions  are  then  found  via  Eqs.  (59),  (60),  and  (61). 
Which  of  these  two  approaches  is  the  more  tractable  one  is  a  decision  that  must 
be  made  for  each  problem  on  its  own  merits. 
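
The life prediction route just described can be sketched as follows. The Python example assumes, for illustration only, normally distributed stress and strength, a mean strength that declines linearly with time, and a constant stress occurrence rate v; none of these choices or numerical values comes from the report. It builds Q(t) from the stress/strength overlap, forms z(t) as in Eq. (58), and obtains R(t) and λ(t) as in Eqs. (59) and (61), the latter by a numerical derivative.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def overlap_q(t, stress_mean=50.0, stress_sd=5.0,
              strength0=100.0, strength_sd=8.0, loss_rate=0.01):
    """P(stress > strength) at time t; mean strength falls linearly (assumed)."""
    strength_mean = strength0 - loss_rate * t
    margin = (strength_mean - stress_mean) / math.hypot(strength_sd, stress_sd)
    return normal_cdf(-margin)


def z(t, rate=0.1):
    """Eq. (58) with a constant stress occurrence rate v (assumed)."""
    return overlap_q(t) * rate * t


def reliability(t, rate=0.1):
    """Eq. (59): R(t) = exp(-z(t))."""
    return math.exp(-z(t, rate))


def hazard(t, rate=0.1, h=1e-3):
    """Eq. (61): lambda(t) = dz/dt, estimated by a central difference."""
    return (z(t + h, rate) - z(t - h, rate)) / (2.0 * h)


for t in (0.0, 1000.0, 2000.0, 3000.0):
    print(f"t = {t:6.0f}  Q = {overlap_q(t):.3e}  R = {reliability(t):.4f}")
print(f"hazard at t = 2000: {hazard(2000.0):.3e}")
```

No observed failures enter this calculation; everything rests on how well the stress and strength distributions and their time dependence have been characterized.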

A simplified wearout modeling exercise is presented in Section 5.4. Corrosion
is  taken  to  be  the  operative  mechanism  and  the  distributed  character  of  important 
parameters  is  emphasized. 

5.2.3  Mixed  Populations 

In  reliability  work  it  is  common  to  generalize  and  classify  causes  of 
equipment  malfunction  as  early,  random,  or  wearout  failures.  Early  failures  are 
associated with hardware which was not delivered in satisfactory condition to
begin with. Manufacturing defects or damage during inspection or shipment
resulting in premature loss of function are examples of causes of early failures.
A decreasing hazard rate, an initially very large time-to-failure pdf, and a
sharply decreasing (initially) reliability function are associated with early failures.
The  random  hazard  situation  has  been  discussed  previously  and  refers  to  the  chance 
occurrence  of  stresses  large  enough  to  induce  failures  in  components  of  normal 
(ordinarily  adequate)  strength.  For  example  a  random  failure  of  an  automobile 
tire  might  be  induced  by  impact  with  a  foreign  object  on  the  roadway.  Such  an 
occurrence  is  random  because  there  is  nothing  about  it  favoring  one  time  interval 
over  another  (of  equal  duration).  Random  failures  exhibit  a  constant  hazard 
rate,  exponential  reliability  attrition,  and  exponentially  distributed  times  to 
failure.  Wearout  failures,  on  the  other  hand,  are  those  that  occur  because 
component  strength  has  eroded  to  the  point  of  not  being  able  to  withstand  the 
stresses  of  normal  service.  In  this  case  the  situation  further  deteriorates  as 
time  passes.  The  hazard  rate  is  an  increasing  function,  the  time-to-failure  pdf 
is  peaked,  and  the  reliability  function  is  high  at  first  and  then  falls  sharply. 

When  a  group  of  hardware  items  is  subject  to  more  than  one  of  the  above 
failure  modes  simultaneously  it  is  termed  a  mixed  population.  For  example  a 
shipment  may  contain  some  initially  defective  as  well  as  some  normal  units.  This 
group  would  be  expected  to  exhibit  early  as  well  as  random  or  wearout  failures. 

It  is  also  not  uncommon  for  random  and  wearout  failures  to  be  intermingled.  An 
item  can  be  characterized  by  a  vulnerability  to  random  overstress  while  processes 
are  underway  to  erode  the  distribution  of  strengths  from  initial  nominal  values. 

In  this  case  the  same  group  of  components  would  exhibit  random  and  wearout  behavior 
at  the  same  time.  More  generally  in  a  system  certain  components  may  show 
predominately  random  hazard  behavior  while  others  fail  due  to  wearout.  In  either 
case  a  proper  reliability/life  description  involves  dealing  with  both  aspects 
simultaneously.  Early  failures  may  have  to  be  treated  also  although  at  the  mature 
system  level  one  prefers  to  have  weeded  out  this  category  via  some  form  of  testing 
or  burn-in  procedure. 

Formally  dealing  with  mixed  populations  is  straightforward  enough  although 
there  are  practical  difficulties  of  course.  The  different  aspects  of  the  problem 
are  taken  to  be  independent  so  that  the  overall  reliability  is  simply  the  product 
of  the  reliabilities  of  the  relevant  subclasses.  If  one  proceeds  from  the  life 
prediction  point  of  view  and  these  functions  have  been  characterized,  there  is  no 
problem. The situation is more complicated if the reliability viewpoint is
taken to interpret time-to-failure data. It must be recognized that no single
familiar distributional model applies to this situation. In principle if the
form of the superposition is known, curve fitting may be employed to fix the
values of the relevant parameters. This approach is tenuous because the
statistical quality of reliability data is not usually sufficient to permit good
separation of several factors contributing to the detailed description of a subtle
superposition problem.
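
The product rule for independent failure classes can be illustrated with a short numerical sketch. It assumes, purely for illustration, a constant-hazard (random) class combined with a Weibull wearout class having zero position parameter; the function names and all parameter values are hypothetical and are not taken from the report.

```python
import math


def reliability_random(t, lam):
    """Constant-hazard (exponential) reliability for the random failure class."""
    return math.exp(-lam * t)


def reliability_wearout(t, eta, beta):
    """Weibull reliability (zero position parameter) used here as a wearout model."""
    return math.exp(-((t / eta) ** beta))


def reliability_mixed(t, lam, eta, beta):
    """Independent failure classes: the overall reliability is the product."""
    return reliability_random(t, lam) * reliability_wearout(t, eta, beta)


def hazard_mixed(t, lam, eta, beta):
    """For independent classes the hazard rates add."""
    return lam + (beta / eta) * (t / eta) ** (beta - 1)


if __name__ == "__main__":
    # Hypothetical parameters: lam per hour, eta in hours, beta > 1 for wearout.
    lam, eta, beta = 1.0e-4, 1500.0, 3.0
    for t in (0.0, 500.0, 1000.0, 2000.0):
        print(t, reliability_mixed(t, lam, eta, beta), hazard_mixed(t, lam, eta, beta))
```

Early in life the constant term dominates the combined hazard; later the wearout term takes over. Neither a pure exponential nor a pure Weibull fit would describe the combined data, which is the practical difficulty noted above.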


5.3  Implementation Philosophy

We  have  touched  on  the  duality  of  reliability  and  service  life  concepts.  One 
description  implies  the  other.  For  the  sake  of  establishing  a  consistent  use  of 
the  nomenclature  the  following  posture  has  been  adopted  in  this  report:  Reliability 
prediction  or  the  reliability  viewpoint  refers  to  interpreting  actual  equipment 
failures under actual use conditions. This would include time-to-failure data or
numbers  of  failures  in  established  intervals,  for  example.  This  is  termed  a 
macroscopic  approach  since  it  treats  phenomenologically  only  observed  failures. 

In  contrast  life  prediction  or  the  expected  service  life  approach  is  a  microscopic 
method that examines in detail the conditions that cause failures to occur.
Measurements are taken to establish a statistical description of the strength (in
a  generalized  sense  referring  to  resistance  to  a  variety  of  types  of  stress)  of 
the  components  of  interest.  The  loading  or  stress  environment  must  also  be  fully 
characterized.  This  means  determining  the  amplitude  and  time  spectral  features 
of applied loads. In all but the most restrictive laboratory settings, this
becomes  a  task  of  great  complexity.  The  method  is  therefore  appealing  when  the 
reliability/life  problem  can  be  reduced  to  perhaps  a  single  aspect  of  particular 
concern. 

There  is  another  area  of  activity  usually  called  accelerated  life  testing 
that overlaps the two approaches discussed above. Thus conclusions are drawn
from time-to-failure data as in the reliability approach. But this information
is developed in a compressed time domain by manipulating (increasing the severity
of) the applied stresses. Interpreting accelerated testing requires a detailed
correlation  of  the  overstress  situation  utilized  with  the  nominal  stress  conditions 
of  primary  interest.  This  implies  a  simultaneous  understanding  of  the  problem 
from  the  stress/strength  overlap  viewpoint.  There  is  a  substantial  literature 
dealing  with  accelerated  testing.  Chapter  9  of  Ref.  11  is  a  good  point  of 
departure.  However,  further  discussion  of  this  topic  is  outside  the  scope  of 
this report. As one can begin to see, the reliability problem is staggering in
scope. Occasionally an impasse will be reached which can be resolved by failure
analysis.  For  example  a  mixed  population  situation  may  lead  to  data  not  described 
by  a  single  model.  Examining  the  physical  character  of  each  failure  may  allow 
decomposition  into  subclasses  that  are  more  easily  interpreted. 

5.4  Dispersion Effects: An Example

In  this  section  an  example  time-dependent  stress/strength  overlap  calculation 
is  presented.  Its  discussion  under  this  heading  has  to  do  with  the  important 
effects  of  distributional  properties  (in  this  case  of  the  corrosion  process  taken 
to  be  inducing  wearout).  The  problem  models  the  strength  attrition  of  a  cylindrical 
load  bearing  member  under  the  influence  of  a  corrosion  process  that  decreases  its 
radius  at  a  constant  average  (but  distributed)  rate.  The  process  wearout  endpoint 
or  strength  service  limit  is  taken  to  be  an  arbitrary  constant  value.  This  is 
equivalent  to  dealing  with  the  situation  where  the  reliability  description  is 
developed using a failure governing stress regarded as a deterministic
(dispersionless) constant. Adopting this approach allows the interesting reliability
insights  to  be  developed  at  a  minimum  cost  in  terms  of  computational  complexity. 


Our first concern in this corrosion modeling problem is to examine the dynamics
of a migrating, spreading strength distribution function f(S,t) sweeping
across a defined endpoint S' as shown in Fig. 15. A general treatment of this
leads to a formal expression for the time-to-failure probability density function.
The strength distribution is taken to be normal with time dependent mean
and standard deviation. Thus

$$ f(S,t) = \frac{1}{\sqrt{2\pi}\,\sigma_S(t)} \exp\left\{ -\frac{1}{2} \left[ \frac{S - \mu_S(t)}{\sigma_S(t)} \right]^2 \right\} . $$

The unreliability of a single unit or the worn out fraction of a population of
similar units equals the area U(S',t) under the strength distribution to the left
of the endpoint S'. Therefore, the worn out fraction is

$$ U(S',t) = \int_{-\infty}^{S'} f(S,t)\,dS = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\frac{S'-\mu_S(t)}{\sigma_S(t)}} e^{-\phi^2/2}\,d\phi , $$

where the change of variable $\phi = (S-\mu_S)/\sigma_S$ has been introduced. Differentiation
of U(S',t) under the integral sign (which is simplified by the variable
transformation that places all the explicit time dependence in the upper integration
limit) yields an expression for the corrosion wearout time-to-failure probability
density function $f_w(S',t)$. This result is


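$$ f_w(S',t) = \frac{dU(S',t)}{dt} = \left[ \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{(\phi')^2}{2} \right) \right] \left[ \frac{d\phi'}{dt} \right] , \tag{64} $$

where $\phi' = \phi(S')$. Taking the indicated derivative of $\phi'$ we can express Eq. (64)
in terms of the time dependent parameters $\mu_S(t)$ and $\sigma_S(t)$ of the strength
distribution f(S,t) directly as

$$ f_w(S',t) = -\frac{1}{\sqrt{2\pi}} \, \frac{\sigma_S \dot{\mu}_S + (S' - \mu_S)\,\dot{\sigma}_S}{\sigma_S^2} \exp\left\{ -\frac{1}{2} \left[ \frac{S' - \mu_S}{\sigma_S} \right]^2 \right\} , $$

where the dots denote time derivatives. The required normalization
$\int_{-\infty}^{\infty} f_w(S',t)\,dt = 1$ is apparent from inspection of Eq. (64).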

To  carry  the  corrosion  modeling  beyond  the  initial  formal  stages  one  needs 
to  display  the  time  dependence  of  the  strength  distribution  explicitly.  This  is 
done  by  distribution  synthesis  under  the  assumptions  that  a  linear  corrosion 
process  operates  to  decrease  the  effective  radius  (and  therefore  load-bearing 
section)  of  an  axially  symmetric  strength  member.  The  load  bearing  capability 
(or  strength)  is  given  by  the  product  of  the  material  tensile  strength  and  the 
remaining sectional area. Letting T represent the tensile strength and $r_0$, c,
and t the initial radius, the corrosion rate, and time respectively, the strength
function is

$$ S = \pi T (r_0 - ct)^2 . $$

For our present illustrative purpose $r_0$ and c are taken to be normally distributed
while T and t are regarded as deterministic parameters. Standard distribution
synthesis arguments then lead to the desired expressions $\mu_S(t)$ and $\sigma_S(t)$ developed
in Appendix D and displayed as Eqs. (67) and (68).

$$ \mu_S(t) = \pi T \left[ (\mu_{r_0} - \mu_c t)^2 + \sigma_{r_0}^2 + \sigma_c^2 t^2 \right] \tag{67} $$

$$ \sigma_S(t) = \pi T \left[ 4 (\mu_{r_0} - \mu_c t)^2 (\sigma_{r_0}^2 + \sigma_c^2 t^2) + 2 (\sigma_{r_0}^2 + \sigma_c^2 t^2)^2 \right]^{1/2} \tag{68} $$

Taking  T  and  t  to  be  distributed  would  complicate  these  expressions  but  not 
particularly  enhance  the  insights  being  developed  in  this  modeling  exercise. 
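
The following is a minimal numerical sketch of the calculation outlined so far, combining Eqs. (64), (67), and (68). All parameter values are hypothetical and are not the entries of Table IV; the function names are likewise introduced only for illustration. The sketch is meaningful only while the mean remaining radius is positive, so time is limited to values well below the mean initial radius divided by the mean corrosion rate.

```python
import math

# Hypothetical inputs (illustrative only, not taken from the report).
T = 1.0                    # tensile strength, deterministic
MU_R, SIG_R = 1.0, 0.02    # mean and standard deviation of initial radius r0
MU_C, SIG_C = 1.0, 0.10    # mean and standard deviation of corrosion rate c


def strength_moments(t):
    """Return mu_S, sigma_S and their time derivatives per Eqs. (67) and (68)."""
    m = MU_R - MU_C * t                 # mean remaining radius
    v = SIG_R**2 + SIG_C**2 * t**2      # variance of remaining radius
    dm, dv = -MU_C, 2.0 * SIG_C**2 * t
    g = 4.0 * m**2 * v + 2.0 * v**2     # bracketed term of Eq. (68)
    dg = 8.0 * m * dm * v + 4.0 * m**2 * dv + 4.0 * v * dv
    mu = math.pi * T * (m**2 + v)
    dmu = math.pi * T * (2.0 * m * dm + dv)
    sig = math.pi * T * math.sqrt(g)
    dsig = math.pi * T * dg / (2.0 * math.sqrt(g))
    return mu, sig, dmu, dsig


def worn_out_fraction(s_end, t):
    """U(S',t): area of the normal strength distribution below the endpoint S'."""
    mu, sig, _, _ = strength_moments(t)
    phi = (s_end - mu) / sig
    return 0.5 * (1.0 + math.erf(phi / math.sqrt(2.0)))


def wearout_pdf(s_end, t):
    """f_w(S',t) per Eq. (64): standard normal density at phi' times d(phi')/dt."""
    mu, sig, dmu, dsig = strength_moments(t)
    phi = (s_end - mu) / sig
    dphi = -(sig * dmu + (s_end - mu) * dsig) / sig**2
    return math.exp(-0.5 * phi**2) / math.sqrt(2.0 * math.pi) * dphi


if __name__ == "__main__":
    # Endpoint chosen (arbitrarily) at half the initial mean strength.
    s_end = 0.5 * strength_moments(0.0)[0]
    for t in (0.20, 0.25, 0.30, 0.35):
        print(f"t={t:.2f}  U={worn_out_fraction(s_end, t):.4f}  "
              f"f_w={wearout_pdf(s_end, t):.3f}")
```

With the mean initial radius and mean corrosion rate both set to one, the time variable here is already expressed in units of mean initial radius divided by mean corrosion rate.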

At  this  point  we  have  obtained  a  general  expression  for  the  corrosion  wearout 
time-to-failure pdf and displayed explicitly the time dependence of the parameters
appearing  therein.  The  next  steps  are  to  take  the  required  time  derivatives, 
simplify  the  notation  a  bit  by  introducing  auxiliary  parameters,  and  calculate  a 
representative  group  of  numerical  results  for  graphical  display  purposes.  It  is 
convenient to work in terms of the fractional strength $\bar{S}$ defined as

$$ \bar{S}(t) = \frac{S(t)}{\mu_S(0)} = \frac{S(t)}{\pi T (\mu_{r_0}^2 + \sigma_{r_0}^2)} , \tag{69} $$

which is distributed with parameters $\mu_{\bar{S}} = \mu_S/\mu_S(0)$ and
$\sigma_{\bar{S}} = \sigma_S/\mu_S(0)$. Some
additional rescaling which complicates the notation slightly but simplifies the
arithmetic is also implemented. A decomposition of the problem appropriate for
numerical evaluation is included as Table IV. In Fig. 16 a typical time-to-failure
distribution function is plotted as a function of the standardized time
variable z (real time reexpressed on an initial mean radius divided by mean
corrosion rate basis). The time-to-failure pdf is seen to be skewed to the right.
Plotting the abscissa of this function on a logarithmic scale almost perfectly
symmetrizes the distribution as shown in Fig. 17. Thus, the log normal distribution
is an excellent representation of the results of this corrosion modeling exercise.
The time-to-failure distribution was numerically integrated to form its cumulative.
This allowed the corrosion wearout reliability function to be calculated for the
model as unity minus the cumulative failure function. This function is compared
in Fig. 18 with an exponential (random hazard) reliability function having the
same MTBF. Additional graphs are presented representing the effects of different
process endpoint choices (S') and corrosion rate dispersions ($\sigma_c/\mu_c$) on the
time-to-failure and reliability functions in Figs. 19 and 20. In the cases examined
endpoint choice has a more pronounced effect than variability of the effective
corrosion rate. One has to temper any conclusion drawn, however, with the
observation that this is a Gedanken experiment and does not yet represent empirical inputs.


The corrosion modeling exercise has an immediate qualitative appeal for two
reasons. The time-to-failure distribution is skewed to the right, exactly the
result one expects of a process terminated as a spreading, symmetric distribution
of strengths sweeps through a sharply defined endpoint. Also, the time-to-failure
pdf seems well represented by a log normal distribution as is often also the case
for empirical corrosion studies. The assumptions leading to these results are
quite simple: a crisp definition of end of useful life, initial part radii normally
distributed, and clustered corrosion rates also taken to be properly represented
by a normal distribution. For the examples considered one can summarize by noting
that the time-to-failure distributions are strongly peaked and weakly skewed to
the right. In a real experiment, structured along the lines of the Gedanken
experiment just considered, exhibiting similar dispersion of the corrosion rate,
and yielding limited time-to-failure stochastic data, one might be hard put to
prefer a skewed description. This is an argument for paying close attention to
the statistical design of reliability evaluation experiments.


5.5  Advantages, Limitations, and Difficulties

What  we  have  chosen  in  this  report  to  call  life  prediction  is  essentially 
simply  the  application  of  stress/strength  overlap  methods  to  hardware  serviceability 
problems.  With  only  the  slightest  change  in  viewpoint  one  would  (perhaps  more 
conventionally)  call  this  probabilistic  design  for  reliability.  In  any  event,  it 
is  the  microscopic  approach  that  we  have  referred  to  involving  a  description  in 
terms  of  the  stresses  operative  in  a  given  situation  and  the  ability  of  hardware 
to  function  under  specified  loading  conditions.  The  method  is  a  very  powerful 
and  definitive  one  provided  the  detailed  data  requirements  can  be  met.  Full 
distributional  information  relating  to  load  stresses  and  component  strengths  is 
needed.  Clearly  the  scale  of  the  problem  for  systems  of  even  moderate  complexity 
preempts  the  use  of  such  a  detailed  approach.  Life  prediction  can  be  most 
beneficially  implemented  when  just  one  or  a  few  areas  of  particular  concern  can 
be isolated. Even these situations will often require separate studies to
characterize the failure governing stress and strength distributions. Hopefully
this situation will improve if practitioners heed the appeal of Kececioglu and
Cormier1" to publish distributional data. Kapur1^ has observed that the problem
can  be  simplified  somewhat  by  concerning  oneself  only  with  the  overlapping  tails 
of  typical  stress/strength  distributions.  It  is  these  regions  that  dominate  the 
probabilistic  unreliability. 

In  the  sonar  setting  it  may  not  be  practical  to  attempt  a  full  probabilistic 
description  of  the  thermal,  chemical,  vibrational,  and  shock  loading  aspects  of 
the  exposed  shipboard  environment.  Focusing  more  narrowly  on  suspected  troublesome 
areas  such  as  the  processes  that  threaten  housing  integrity  might  prove  both 
tractable  and  beneficial,  however.  A  probabilistic  approach  has  a  particular 
appeal  in  that  it  represents  a  description  emphasizing  the  distributional  aspects 
of  the  reliability/life  problem.  Even  when  dealing  with  rather  basic  structural 
materials,  Bondi1^  has  pointed  out  that  it  is  precisely  the  dispersion  of  their 
physical  properties  that  strongly  impacts  their  usefulness. 

Probabilistic  design  methods  have  been  under  development  on  a  rather  broad 
front  for  twenty  years  or  so.  Very  little  of  this  pretentious  structure  has  been 
displayed  in  this  report.  Thus  the  reader  is  cautioned  not  to  underestimate 
either  the  labor  or  potential  benefits  of  addressing  sonar  hardware  problems 
probabilistically.  Input  information  is  the  key.  Given  distributional  data 
calculation of stress/strength overlap has been reduced to quadrature. Perhaps
the  most  powerful  numerical  approach  is  Monte  Carlo  simulation.  Computer  programs 
for  this  have  been  developed. 
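
As a hedged illustration of the Monte Carlo idea (this is not one of the computer programs referred to above), the sketch below estimates the probability that a randomly drawn stress exceeds a randomly drawn strength. Independent normal distributions are assumed only because that case has a closed-form answer to check against; the function names and numerical values are hypothetical.

```python
import math
import random


def failure_probability_mc(mu_strength, sd_strength, mu_stress, sd_stress,
                           trials=200_000, seed=1):
    """Estimate P(stress > strength) by direct sampling of both distributions."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        strength = rng.gauss(mu_strength, sd_strength)
        stress = rng.gauss(mu_stress, sd_stress)
        if stress > strength:
            failures += 1
    return failures / trials


def failure_probability_exact(mu_strength, sd_strength, mu_stress, sd_stress):
    """Closed form for independent normal stress and strength distributions."""
    z = (mu_strength - mu_stress) / math.hypot(sd_strength, sd_stress)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical distribution parameters in arbitrary load units.
    args = (100.0, 10.0, 70.0, 8.0)
    print("Monte Carlo:", failure_probability_mc(*args))
    print("Closed form:", failure_probability_exact(*args))
```

The sampling approach carries over unchanged to non-normal or empirically tabulated distributions, where no closed form is available; only the two sampling lines need to be replaced.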


6.0  RELIABILITY/LIFE DEMONSTRATION


Interest  in  product  reliability  is  multifarious.  Common  areas  of  concern 
are  defining  realistic  goals,  upgrading  an  engineering  design,  reducing  equipment 
life-cycle  costs,  coping  with  state-of-the-art  performance,  and  evaluating 
deliverable  equipment.  Reliability  prediction  is  a  tool  in  the  hardware  improvement 
process. But when it comes to obtaining the most definitive kind of statement
about  the  progress  made,  some  form  of  demonstration  test  is  required.  A  general 
but  not  too  detailed  discussion  of  several  aspects  of  reliability/life  evaluation 
concerns  follows. 


6.1  Preferred  Kinds  of  Information 


Equipments  can  be  evaluated  in  a  number  of  ways.  But  ordinarily  the  primary 
concern  is  how  well  or  for  how  long  will  the  hardware  meet  mission  performance 
objectives  under  the  conditions  of  use  intended.  In  one  sense  the  most  definitive 
answer  to  this  kind  of  question  comes  from  accumulating  actual  field  service 
experience.  Often  such  information  is  unavailable  and  it  is  rarely  timely  for 
making  procurement  decisions.  Next  best  with  respect  to  determining  the  reliability 
properties  of  interest  is  laboratory  testing  designed  to  substantially  reproduce 
field  conditions.  Here  timeliness  may  (or  may  not)  be  improved  and  questions  of 
cost  have  to  be  addressed  for  highly  reliable  equipment.  Applied  stresses  may  be 
increased  to  decrease  test  time  or  the  number  of  units  required  to  be  set  apart 
for  evaluation  purposes.  This  is  accelerated  testing  and  of  course  is  itself  not 
without  confounding  features.  Thus  full  interpretation  requires  one  to  relate 
real-time  and  accelerated  test  results  and  live  with  the  uncertainties  of  the 
description.  Against  this  background  of  complications  some  fairly  efficient 
means  of  evaluating  performance  and  accepting  or  rejecting  production  lots  have 
been developed. The methods are termed parametric or non-parametric depending on
whether or not a description is developed in terms of (the parameters of) a
characterizable distribution function. Some examples are discussed in the
following  sections. 

Regardless  of  what  approach  best  suits  a  particular  application,  from  a 
reliability  point  of  view  we  will  ask  whether  the  item  of  interest  is  still 
functional.  Is  it  failed  or  unfailed  after  some  period  of  operation?  In 
characterizing  life  we  will  need  to  know  when  the  failure  occurred  relative  to 
when  operation  of  the  equipment  began.  Obviously,  then,  it  will  be  necessary  to 
decide  specifically  what  excursions  from  nominal  performance  are  to  be  considered 
tolerable and which constitute failure. If time-to-failure information is to be
acquired, monitoring procedures having an appropriate time resolution must be
implemented. Some cautions are in order. Test conditions must be similar to use
conditions  if  results  are  to  be  applied  directly  to  equipment  to  be  placed  in 
service.  Also  it  is  important  that  test  equipments  be  similar  (meaning  as  closely 
alike  as  manufacturing  procedures  allow)  to  hardware  intended  for  field  use  to 
allow  valid  inferences  to  be  made.  To  achieve  the  latter  it  is  desirable  to 
implement  some  scheme  to  select  an  unbiased  test  sample  from  a  larger  homogeneous 
production  lot. 

Summarizing the essential points, time-to-failure information is the most
useful type while total failures in an interval represents a class also of
importance. The former are required for the construction
reliability/life descriptions. Total failure data permit
point reliability estimate (itself distributed as we have
no light on time-to-failure distributions of more than one
liability  description  fleshed  out  in  terms  of  producer  and 
lly  considered  adequate  for  acceptance  testing.  A  distribu¬ 
te  be  preferred  for  failure  mode  diagnostics,  prediction 
comparison  with  probabilistic  design  or  physics-of-failure 


6.2  Inferring Distribution Parameters

Often  one  wishes  to  describe  a  reliability  problem  in  terms  of  a  mathematical 
model.  The  properties  of  a  few  important  models  were  discussed  earlier  in  the 
report.  It  is  desirable  that  a  particular  model  be  advanced  on  the  basis  of 
physical  arguments.  Whether  in  practice  a  model  is  introduced  systematically  in 
this  way  or  on  a  more  pragmatic  basis,  a  modeling  exercise  ultimately  involves 
choosing  the  model  parameters  that  best  represent  the  available  data.  Several 
methods  for  this  such  as  matching  moments,  probability  paper  plotting,  or  standard 
regression  analysis  are  available.  The  latter  two  approaches  are  discussed 
further  in  the  next  two  sections. 


6.2.1  Plotting Methods


In establishing or making use of correlations between experimental data
and the parameters of an associated physical model, it is often convenient
to introduce variable transformations to linearize the model. The model is
confirmed if the transformed data plot as a straight line on linear coordinate
paper.  Fitting  the  best  straight  line  to  the  data  can  be  done  visually  (avoiding 
numerical  regression  analysis).  Furthermore  model  parameters  can  be  inferred 
from  the  slope  and  intercept  of  the  best-fit  line  making  use  of  all  the  data  at 
once. This gives roughly equivalent results and is more efficient than
statistically processing the various point estimates separately.

Linear curve fitting can be further streamlined from the data analyst's
point of view. To do this one builds the necessary mathematical rescaling
directly into the coordinate axes of the graphical display. For example,
consider the two-parameter exponential reliability model:

$$ R = e^{-\lambda (t - \gamma)} . \tag{70} $$

Taking natural logarithms yields

$$ \ln R = -\lambda t + \lambda\gamma . \tag{71} $$

Or equivalently

$$ \ln(1/R) = \lambda t - \lambda\gamma . \tag{72} $$

Equations (71) and (72) are linear in the standard slope-intercept form y = mx + b
with independent variable t and dependent variable ln R or ln(1/R) = -ln R. Thus
ln R plots linearly on linear coordinate paper and R itself plots as a straight
line on standard semi-logarithmic paper. There remain some provisos associated
with the use of Eqs. (71) or (72). The reliability R is not a directly observable
quantity but must be estimated on the basis of observed failures among similar
equipments. We will digress to explore a preferred approach for this.

Reliability data requirements are discussed in Section 8.3. Anticipating
the character of that discussion, imagine that we are blessed with a set of
time-to-failure data to be analyzed. Times to failure may be grouped to generate a
frequency histogram or ranked to synthesize a representation of the cumulative
time-to-failure distribution of the population from which the sample was drawn.
Johnson14 has pointed out difficulties in inferring distributional properties
using the former approach such as sensitivity to class interval choice when
dealing with small samples. Further in a very pretty logical exposition1 he
has developed the median rank method of organizing ordered failure data. This
approach has much to recommend it in connection with best characterizing the
cumulative distribution representing the parent population.

Briefly  paraphrasing  some  of  Johnson's14  introductory  discussion  we  observe 
the  following:  Consider  that  a  sample  of  N  units  has  been  selected  (presumably 
randomly)  from  a  larger  population  and  tested  to  failure.  The  prerequisite  for 
constructing  a  cumulative  plot  is  to  appropriately  rank  each  failure.  Thus,  if 
the  entire  population  were  tested,  each  of  the  N  subset  failures  would  have  a 
definite  fraction  of  the  population  failing  earlier.  Correct  specification  of 
this  quantity  for  a  given  observed  failure  would  be  its  true  rank  within  the 
overall  failure  distribution.  Since  the  true  rank  is  ordinarily  unknown,  the 
best  we  can  do  is  estimate  it.  The  estimate  that  has  equal  probabilities  of 
being  too  high  and  too  low  is  called  the  median  rank.  Johnson14  shows  that  the 
true  ranks  of  ordered  failures  within  a  subpopulation  are  beta  distributed  and 
that  the  median  rank  MR  of  the  jth  failure  among  N  samples  tested  is  obtained 
from  the  cumulative  binomial  distribution  (partial  binomial  sum)  via 

$$ \sum_{k=0}^{N-j} \frac{N!}{k!\,(N-k)!} \, (MR)^{N-k} (1-MR)^{k} = \frac{1}{2} , \qquad 0 < MR < 1 . \tag{73} $$

Equation  (73)  is  of  order  N  in  MR  but  has  uniquely  one  root  in  the  interval  0  to 
1.  Tables  of  median  ranks  as  well  as  other  percentile  ranks  have  been  prepared 
by  J.  S.  White  and  incorporated  in  Ref.  2.  Some  rank  distributions  and  their 
median  ranks  are  displayed  in  Fig.  21  for  a  sample  of  10  units. 
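
In place of tables, the single root of the median rank equation can be found numerically. The sketch below solves the partial binomial sum for MR by bisection, taking the sum over k from 0 to N - j; those limits are our reading of Eq. (73) and should be checked against Johnson's text. Benard's approximation (j - 0.3)/(N + 0.4), a widely used shortcut that does not appear in this report, is included only for comparison. The function names are hypothetical.

```python
import math


def partial_binomial_sum(mr, j, n):
    """Sum over k = 0..n-j of C(n,k) * mr^(n-k) * (1-mr)^k."""
    return sum(math.comb(n, k) * mr ** (n - k) * (1.0 - mr) ** k
               for k in range(n - j + 1))


def median_rank(j, n, tol=1e-12):
    """Median rank of the j-th ordered failure among n units, by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The partial sum increases monotonically from 0 to 1 as mr goes 0 -> 1.
        if partial_binomial_sum(mid, j, n) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def benard_approximation(j, n):
    """Benard's closed-form approximation to the median rank (not from the report)."""
    return (j - 0.3) / (n + 0.4)


if __name__ == "__main__":
    n = 10  # same sample size as the Fig. 21 illustration
    for j in range(1, n + 1):
        print(j, round(median_rank(j, n), 5), round(benard_approximation(j, n), 5))
```

For the first and last ordered failures the root has the closed forms 1 - 2^(-1/N) and 2^(-1/N), which provide a convenient check on the numerical solution.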

Let us return to the construction of probability plotting paper. Conventionally
and for the reasons discussed some function of the cumulative time-to-failure
distribution is plotted as a function of time, rescaled to yield a linear
description. These plots are arranged to have positive slope. Consider the
Weibull reliability function as a starting point for example

$$ R = \exp\left[ -\left( \frac{t-\gamma}{\eta} \right)^{\beta} \right] . \tag{74} $$

Taking reciprocals and then taking natural logarithms twice yields

$$ \ln\ln(1/R) = \beta \ln(t-\gamma) - \beta \ln\eta . \tag{75} $$

Or since R = 1 - U and the median rank MR is a preferred estimate of the
cumulative failure function U, Eq. (75) becomes

$$ \ln\ln\left( \frac{1}{1-MR} \right) = \beta \left( \ln(t-\gamma) \right) - \left( \beta \ln\eta \right) . \tag{76} $$

Equation (76) is linear (in the rescaled quantities) in slope-intercept form.
Weibull probability paper is constructed by building the required scaling into
the coordinate labeling so that MR plotted versus $t-\gamma$ yields a straight line
directly.


A  variety  of  Weibull  papers  as  well  as  probability  papers  for  other 
distributions  are  obtainable.  Ford  Motor  Company  and  General  Motors  have 
both  developed  Weibull  papers  for  internal  use.  Probability  plotting  papers 
are  commercially  available  from  a  company  that  identifies  itself  by  the 
acronym  TEAM  (Technical  and  Engineering  Aids  for  Management).  A  catalog 
of  their  special  purpose  graph  papers  is  available  on  request.  Contact 

TEAM 

P.O. Box 25
Tamworth,  N.H.  03886 
Telephone:  (603)323-8843. 

More  detailed  descriptions  of  the  use  of  probability  paper  are  given  in 
References  2,  15,  and  16.  The  latter  document  discusses  papers  developed  by 
R.  A.  Evans.  Most  probability  papers  label  the  ordinate  axis  as  "percent  failed" 
or  "percent  failure"  while  observed  failure  times  are  plotted  as  abscissas.  The 
median  rank  has  been  discussed  as  a  preferred  estimate  of  the  percent  failed  and 
thus can be used directly in probability plotting. Evans^ recommends the
essentially equivalent approach of plotting each datum twice at ordinates r/n and
(r-1)/n where the notation refers to the r-th ordered failure among n items
tested.  These  two  points  fall  on  either  side  of  the  corresponding  median  rank 
and,  of  course,  are  easy  to  calculate.  Both  methods  are  applicable  even  if 
failures  do  not  occur  for  all  n  items  tested  (censored  test). 

Some facsimile time-to-failure data including the specification of median
ranks is displayed as Table V. This information is shown plotted on Weibull
probability paper in Fig. 22. Three curves are shown representing different
choices of the position parameter $\gamma$. Fixing $\gamma$ is an iterative procedure. If
no position parameter can be found which linearizes (approximately, since data are
usually scattered) the Weibull plot, one concludes that the times-to-failure
are not Weibull distributed. If a linear Weibull plot is obtained, the shape
and scale parameters $\beta$ and $\eta$ are found via simple graphical procedures that
vary  slightly  depending  on  the  particular  paper  employed.  Discussion  of  the 
uncertainties  to  be  associated  with  the  parameter  values  obtained  via  probability 
plotting  is  deferred  to  Section  6.3. 
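
The graphical procedure has a direct numerical counterpart: compute the rescaled coordinates of the Weibull linearization and fit a straight line, reading the shape parameter from the slope and the scale parameter from the intercept. The sketch below is an illustration under stated assumptions, not the procedure of any particular probability paper. The failure times are invented (they are not the Table V data), the position parameter is simply supplied by the caller, and Benard's approximation (j - 0.3)/(n + 0.4) stands in for tabulated median ranks.

```python
import math


def benard_median_ranks(n_failed, n_tested):
    """Approximate median ranks for the first n_failed of n_tested units."""
    return [(j - 0.3) / (n_tested + 0.4) for j in range(1, n_failed + 1)]


def fit_weibull(times, n_tested=None, gamma=0.0):
    """Least-squares line through ln ln[1/(1-MR)] versus ln(t - gamma).

    Returns (beta, eta): slope = beta, intercept = -beta * ln(eta).
    n_tested may exceed len(times) for a censored test.
    """
    times = sorted(times)
    n_tested = len(times) if n_tested is None else n_tested
    ranks = benard_median_ranks(len(times), n_tested)
    xs = [math.log(t - gamma) for t in times]
    ys = [math.log(math.log(1.0 / (1.0 - mr))) for mr in ranks]
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta = sxy / sxx
    intercept = y_bar - beta * x_bar
    eta = math.exp(-intercept / beta)
    return beta, eta


if __name__ == "__main__":
    # Hypothetical times to failure in hours.
    sample = [410.0, 640.0, 820.0, 990.0, 1170.0, 1380.0, 1650.0, 2100.0]
    beta, eta = fit_weibull(sample)
    print(f"beta = {beta:.2f}, eta = {eta:.0f} hours")
```

Repeating the fit for several trial values of `gamma` and keeping the one that gives the straightest line mimics the iterative choice of position parameter described above.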


6.2.2  Curve  Fitting 

In the previous section we have considered inferring distribution parameters
using probability plotting and visual curve fitting. This is a convenient and
widely accepted approach to standard regression analysis. When appropriate,
regression analysis can be carried out with greater precision using numerical
methods. We will explore this avenue here for utilitarian reasons as well as
to develop further insights in the reliability/life context. Numerical regression
analysis is commonly called curve fitting or least-squares curve fitting and
is widely discussed in texts dealing with applied statistics. The specific
approach to the subject that we will follow is developed by Bevington'^.


Suppose that we are dealing with an experimental situation where a dependent
variable y is linearly related to an independent variable x via

$$ y(x) = a_0 + b_0 x . \tag{77} $$

The quantities a_0 and b_0 are the true (but unknown) parameters of the linear
model which we would like to estimate from a set of paired observations (x_i, y_i).
To make the example specific let us assume that very accurate observations of
the x_i are available while the y_i are normally distributed with standard
deviations σ_i about the true (but unknown) values y(x_i). The probability that the
ith measurement will yield a value y_i is then

$$ P_i = \frac{1}{\sigma_i \sqrt{2\pi}} \exp\left\{ -\frac{1}{2} \left[ \frac{y_i - y(x_i)}{\sigma_i} \right]^2 \right\} . \tag{78} $$

Provided the y_i are independent the probability of making an entire set of N
observations of y_i at different x_i is given by a product of N factors of the
form of Eq. (78):

$$ P(a_0, b_0) = \prod_{i=1}^{N} P_i = \left[ \prod_{i=1}^{N} \frac{1}{\sigma_i \sqrt{2\pi}} \right] \exp\left\{ -\frac{1}{2} \sum_{i=1}^{N} \left[ \frac{y_i - y(x_i)}{\sigma_i} \right]^2 \right\} . \tag{79} $$

We cannot actually evaluate Eq. (79) because we do not know the true linear
model parameters a_0 and b_0. However, we can rewrite Eq. (77) in terms of
estimates a and b of these quantities as

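$$ y(x) = a + b x . \tag{80} $$

Using Eq. (80) in Eqs. (78) and (79) yields the probability that the set of N
observations is associated with the estimated values of the coefficients a and b.
Thus

$$ P(a, b) = \left[ \prod_{i=1}^{N} \frac{1}{\sigma_i \sqrt{2\pi}} \right] \exp\left\{ -\frac{1}{2} \sum_{i=1}^{N} \left[ \frac{y_i - a - b x_i}{\sigma_i} \right]^2 \right\} . \tag{81} $$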


The principle of maximum likelihood asserts that the set of measurements
actually obtained experimentally is more likely to belong to the true parent
distribution than to any similar distribution with different coefficients.
Thus the best estimates of the parent parameters are obtained by maximizing the
probability given in Eq. (81). This is accomplished by minimizing the argument
of the exponential, or equivalently the function

$$ \chi^2 = \sum_{i=1}^{N} \left[ \frac{y_i - a - b x_i}{\sigma_i} \right]^2 . \tag{82} $$

2. Multiple values y_i observed at the same x_i are normally distributed.

3. The y_i observed at different x_i are independent (uncorrelated).

4. The regression line is centered on the mean of each normally distributed
y_i.

5. The statistical weights 1/σ_i² of each observation need to be specified.

6.  The  maximum  likelihood  concept  is  reasonable. 

These statements are, in fact, appropriate in a variety of measurement situations.
The approach can be made more general or more specific. For example if the x_i
are themselves distributed, their uncertainties can be reflected into the
y-coordinates through the slope of the regression line and appropriately taken
into account. A simplification often occurs if all data are taken using the
same instrument in the same way. In this case, the standard deviations σ_i may
all be the same so that all data points are given the same statistical weight.
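
As a concrete illustration of weighted straight-line fitting under the assumptions listed above, the Python sketch below minimizes the weighted sum of squared residuals for the model y = a + bx. It uses the standard textbook closed-form solution of the two normal equations. The function and variable names are illustrative and are not taken from the report or its appendices.

```python
def weighted_linear_fit(x, y, sigma):
    """Minimize sum(((y_i - a - b*x_i) / sigma_i)**2) over a and b."""
    w = [1.0 / s ** 2 for s in sigma]                 # statistical weights 1/sigma_i^2
    s_w = sum(w)
    s_x = sum(wi * xi for wi, xi in zip(w, x))
    s_y = sum(wi * yi for wi, yi in zip(w, y))
    s_xx = sum(wi * xi * xi for wi, xi in zip(w, x))
    s_xy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    delta = s_w * s_xx - s_x ** 2                     # determinant of the normal equations
    a = (s_xx * s_y - s_x * s_xy) / delta             # intercept estimate
    b = (s_w * s_xy - s_x * s_y) / delta              # slope estimate
    return a, b
```

When every `sigma` entry is the same, the weights cancel and the result reduces to the ordinary unweighted least-squares line, which is the simplification noted in the text.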

Against  the  properties  of  common  forms  of  least-squares  fitting  let  us 
look  back  at  the  nature  of  the  probability  plotting  method  discussed  in 
Section  6.2.1.  We  might  comment  on  each  of  the  above  assumptions  individually: 

1. In probability plotting the independent parameter is time directly
or some function of time (such as ln t). It is treated as dispersionless
although in any given experimental realization one must
specify the precision of the time-to-failure information developed.

2. In probability plotting the ordinate values are typically ln[1/(1-MR)]
or ln ln[1/(1-MR)]. As we have seen true ranks are beta distributed.
As a practical matter the above functions may be roughly normally
distributed but they are not expected to be rigorously normally
distributed.

3. Median ranks are preferred estimators of order statistics in random
samples from a uniform distribution. These order statistics are
not independent (see, for example, Section 2.40 of Ref. 17). Thus
the ordinate values used in probability plotting are not uncorrelated.
This is probably the most serious obstacle to the realization of a
straightforward and satisfying interpretation of time-to-failure
data.

4. In probability plotting we are dealing with a visual fit to the data.
We are not in a position to comment very specifically on how the
ordinates are distributed or the regression line is positioned.

5. The median ranks used in probability plotting represent different
beta distributions. A statistical weight reflecting the differing
dispersions of different ordered failures should be constructed.
No such adjustment is made in ordinary probability plotting. Another
shortcoming is that the functional rescaling employed in constructing
probability plots affects the parameter uncertainties as well
as the parameters themselves. Thus the same transformations should
be used to reconstruct proper statistical weights.

6. Maximum likelihood ideas are felt to be very appropriate in a variety
of settings. However, we have already flagged several anomalies between
standard least-squares fitting and probability plotting. The
departures seem to be sufficient to suggest that the adequacy of
visual fitting be tested other than on the basis of any relationship
to maximum likelihood.

We  have  called  attention  to  a  number  of  ways  in  which  probability  plots  fail 
to  rigorously  meet  the  requirements  associated  with  standard  least-squares 
fitting  schemes.  Of  course  these  objections  are  generally  recognized  and  serve 
as the basis for placing probability plotting in proper perspective. Thus it
is said that probability plotting does not have a definite statistical
interpretation. The technique is recommended for rapid visualization of data
trends. It should be used to discard models that are conspicuously
inappropriate but not be relied on to select the best model from several apparently
good ones. The Weibull model, which is very interesting because of its
versatility, is also difficult to determine well via the probability plotting approach.
That is, the parameter values obtained tend to carry large uncertainties.

Least-squares  fitting  schemes  can  be  tailored  to  more  directly  deal  with 
some  of  the  features  of  the  probability  plotting  problem.  For  example  the  need 
to  rescale  statistical  weights  can  be  obviated  by  fitting  to  a  cumulative 
distribution  directly  rather  than  to  a  function  linearized  through  coordinate 
transformations.  Bevington  discusses  least-squares  fitting  to  an  arbitrary 
function in Chapter 11 of Ref. 17. Equation (82) is generalized to

$$ \chi^2 = \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \left[ y_i - y(x_i) \right]^2 , \tag{86} $$

where y(x) is an arbitrary function of x and a set of n parameters a_j. The
function y(x) may be expanded in a Taylor series. Retaining terms to first
order in the parameter increments δa_j, Eq. (86) becomes

$$ \chi^2 = \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \left[ y_i - y_0(x_i) - \sum_{j=1}^{n} \frac{\partial y_0(x_i)}{\partial a_j} \, \delta a_j \right]^2 , \tag{87} $$

where y_0(x) is an initial estimate of the desired fitting function. Requiring
as before that the derivatives of Eq. (87) with respect to δa_j simultaneously
vanish yields for k = 1,...,n

$$ \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \left[ y_i - y_0(x_i) - \sum_{j=1}^{n} \frac{\partial y_0(x_i)}{\partial a_j} \, \delta a_j \right] \frac{\partial y_0(x_i)}{\partial a_k} = 0 . \tag{88} $$

Equations (88) are a set of n simultaneous, linear equations in δa_j, the
corrections to the initial parameter estimates. Since Eqs. (88) are only
asymptotically correct as y_0(x) approaches the true regression profile, iteration
is required. Equations (88) are solved using determinantal or matrix inversion
methods. The resulting δa_j are used to construct an improved test function
y_0(x, a_j) -> y_0(x, a_j + δa_j). The process is repeated until χ² reaches a stable
minimum. Convergence is rapid, usually requiring only three to five iterations.

A full blown iterative least-squares fitting program for matching the
Weibull cumulative distribution to median rank versus time-to-failure data is
presented as Appendix E. The program is written taking the a priori statistical
weights 1/σ_i² = 1. It includes calculation of the uncertainties to be
associated with the final fitting parameters based on the error analysis developed by
Bevington (Ref. 17). Although the fitting program is set up to treat the Weibull
cumulative distribution, its use is much less restricted. The reader may use
Appendix E for other curve fitting problems simply by substituting another
fitting function and its first derivatives with respect to each of the parameters
involved.
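
For readers without access to the appendix, the iteration can be sketched compactly in Python. This is not the Appendix E program. It is a minimal illustration that assumes the two-parameter Weibull cdf F(t) = 1 - exp[-(t/η)^β], unit statistical weights, and a simple step-halving safeguard that is an addition of this sketch. Parameter uncertainties are taken from the inverse of the 2x2 curvature matrix, scaled by a variance estimated from the residuals. All names are hypothetical.

```python
import math


def weibull_cdf(t, beta, eta):
    """Two-parameter Weibull cumulative distribution F(t)."""
    return 1.0 - math.exp(-((t / eta) ** beta))


def chi_square(times, ranks, beta, eta):
    """Least-squares statistic with unit weights (all sigma_i = 1)."""
    return sum((mr - weibull_cdf(t, beta, eta)) ** 2 for t, mr in zip(times, ranks))


def normal_equations(times, ranks, beta, eta):
    """Curvature-matrix elements (a11, a12, a22) and right-hand side (r1, r2)."""
    a11 = a12 = a22 = r1 = r2 = 0.0
    for t, mr in zip(times, ranks):
        z = (t / eta) ** beta
        surv = math.exp(-z)
        d_beta = surv * z * math.log(t / eta)   # dF/dbeta
        d_eta = -surv * z * beta / eta          # dF/deta
        resid = mr - (1.0 - surv)
        a11 += d_beta * d_beta
        a12 += d_beta * d_eta
        a22 += d_eta * d_eta
        r1 += resid * d_beta
        r2 += resid * d_eta
    return a11, a12, a22, r1, r2


def fit_weibull(times, ranks, beta, eta, max_iter=50, tol=1e-12):
    """Return (beta, eta, sigma_beta, sigma_eta) from iterated linearized fits."""
    chi2 = chi_square(times, ranks, beta, eta)
    for _ in range(max_iter):
        a11, a12, a22, r1, r2 = normal_equations(times, ranks, beta, eta)
        det = a11 * a22 - a12 * a12
        step_beta = (r1 * a22 - r2 * a12) / det   # Cramer's rule, 2x2 system
        step_eta = (a11 * r2 - a12 * r1) / det
        scale = 1.0
        while scale > 1e-6:                       # halve the step if chi-square rises
            new_beta = beta + scale * step_beta
            new_eta = eta + scale * step_eta
            if new_beta > 0.0 and new_eta > 0.0:
                new_chi2 = chi_square(times, ranks, new_beta, new_eta)
                if new_chi2 <= chi2:
                    break
            scale *= 0.5
        else:
            break                                 # no acceptable step was found
        gain = chi2 - new_chi2
        beta, eta, chi2 = new_beta, new_eta, new_chi2
        if gain < tol:
            break
    a11, a12, a22, _, _ = normal_equations(times, ranks, beta, eta)
    det = a11 * a22 - a12 * a12
    s2 = chi2 / (len(times) - 2)                  # data variance estimated from residuals
    sigma_beta = math.sqrt(s2 * a22 / det)        # error-matrix diagonal a22/det
    sigma_eta = math.sqrt(s2 * a11 / det)         # error-matrix diagonal a11/det
    return beta, eta, sigma_beta, sigma_eta
```

Starting values for `beta` and `eta` would normally come from a probability plot. Substituting another fitting function only requires replacing `weibull_cdf` and the two derivative expressions in `normal_equations`.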

Unfortunately  improved  curve  fitting  does  not  solve  all  the  problems 
associated  with  interpreting  time-to-failure  information.  Ascertaining  the 
parameters  of  time-to-failure  distributions  raises  general  questions  in  the 
theory  of  order  statistics  and  is  discussed  in  greater  detail  in  Chapter  5  of 
Ref.  11.  This  source  refers  in  particular  to  an  impressive  series  of  papers  by 
N.  R.  Mann  dealing  largely  with  the  Weibull  model. 

6.3 Quality of Description

The plotting and curve fitting methods just discussed relate to the question
of estimating the parameters of cumulative time-to-failure distributions that
best represent observed, ordered time-to-failure data. When these results are
obtained,  one  necessarily  inquires  about  their  quality  or  dispersion.  A  couple 
of  ways  of  addressing  this  kind  of  question  are  discussed  in  the  next  two 
subsections  of  the  report.  In  reliability  studies  the  conclusions  one  draws 
from  this  are  usually  not  very  satisfying  owing  in  part  to  limited  data  but 
mostly  to  the  distributional  aspects  of  the  problem.  Thus  the  parameters  of 
interest  exhibit  substantial  uncertainties.  This  has  nothing  to  do  with  the 
power  or  efficiency  of  regression  analysis  per  se.  Nevertheless  it  is  a 
frustration  to  those  interested  in  characterizing  hardware  reliability. 

6.3.1  Dispersion  Estimates 

Evans (Ref. 16), in connection with the use of probability papers, gives graphical
constructions for estimating the errors to be associated with linearized visual
regression analysis parameters. Actually his description incorporates results
of one of the statistical goodness-of-fit tests discussed in Section 6.3.2.
Bevington (Ref. 17) discusses the estimation of uncertainties associated with parameters
obtained by curve fitting. These are of course related to the overall
"goodness-of-fit" obtained in the analysis. But there is a distinction between error
analysis and goodness-of-fit tests. The former measures the dispersion of
parameters obtained by curve fitting. The latter measure the probability that
the observed data in fact belong to the distribution tested (with its parameters
specified).

In this section we pursue the error analysis line of reasoning briefly.
As Bevington points out, the error σ_{a_j} associated with a fitting parameter a_j
is due to the experimental errors collectively, according to the weighted sum

$$ \sigma_{a_j}^2 = \sum_{i=1}^{N} \sigma_i^2 \left( \frac{\partial a_j}{\partial y_i} \right)^2 . $$

Using this result the squared uncertainties (two-parameter linear regression) are

$$ \sigma_a^2 = \frac{1}{\Delta} \sum_{i=1}^{N} \frac{x_i^2}{\sigma_i^2} , \qquad \sigma_b^2 = \frac{1}{\Delta} \sum_{i=1}^{N} \frac{1}{\sigma_i^2} , $$

where Δ is given by Eq. (85c). The method is readily generalized to higher
order linear regression models and to curve fitting to an arbitrary function.
Bevington (Ref. 17) constructs the logical extension to the latter case by considering
the change in a single parameter which produces a unit increase in the least
squares fitting statistic χ² [Eq. (86)] minimized with respect to the other
parameters. All of these cases are then neatly described by expressing parameter
uncertainties in terms of the error matrix ε via

$$ \sigma_{a_j}^2 = \epsilon_{jj} . $$

The error matrix is defined as the inverse of the curvature matrix (in fitting
parameter hyperspace). Thus

$$ \epsilon = \alpha^{-1} , $$

where the elements of α are given by

$$ \alpha_{jk} = \frac{1}{2} \frac{\partial^2 \chi^2}{\partial a_j \, \partial a_k} . $$

If the data point uncertainties are unknown, they can be estimated from the
overall data record itself via

$$ \sigma_i^2 \approx s^2 = \frac{1}{N - n} \sum_{i=1}^{N} \left[ y_i - y(x_i) \right]^2 , $$

and this estimate is then carried into the parameter uncertainties.


6.3.2 Goodness-of-Fit Tests


Goodness-of-fit tests are structured to provide some measure of the
likelihood that a given set of observations (sample test results) in fact belong
to some specified distribution. The distribution parameters may be given a
priori or obtained from analyzing the data record itself. Two of the better
known goodness-of-fit tests are discussed in this section. These are the χ²
test developed by Pearson (Ref. 18) and the Kolmogoroff-Smirnov test. References
11, 19, and 20 all provide chapters on goodness-of-fit or statistical
inference tests with the latter two being most appropriate for our present purpose.
All of these sources identify additional reference material. Kececioglu (Ref. 21)
gives a particularly lucid description of the practical application of these
two methods.


The χ² test is based on the proposition that the statistic

$$ \chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \tag{95} $$

is approximately χ² distributed with ν = k-m-1 degrees of freedom where

k = number of class intervals into which the data are grouped

m = number of parameters of the test distribution obtained from
the data record itself

O_i = observed event frequency in the ith class interval

E_i = theoretically expected event frequency in the ith class interval.


Roughly speaking the preferred number of classes is 5, 7, or 9 depending on
whether  the  total  number  of  observations  is  of  order  10,  100,  or  1000  respectively. 
One  also  prefers  that  each  class  interval  contain  at  least  5  events.  Class 
intervals  need  not  all  be  the  same  size  in  event  parameter  space  (time,  cycles 
to  failure,  etc.)  to  accomplish  the  latter.  A  number  of  examples  of  grouping 
data  and  setting  up  the  test  are  provided  in  the  references  cited  above. 

Ultimately one calculates the test statistic according to Eq. (95) and compares
it to tabulated percentile values of the chi-square distribution. One can
write the probability statement

$$ 1 - \alpha = P\left[ \chi^2 \le \chi^2_{(1-\alpha),\nu} \right] = \int_0^{\chi^2_{(1-\alpha),\nu}} f(\chi^2) \, d\chi^2 . \tag{96} $$

Equation (96) states that if the inequality (equality) is satisfied, the original
test hypothesis is not rejected at the α level of significance. For example if
α = 0.05 and ν = 4, there is only a 5% probability that the test statistic
χ² will exceed the critical value χ²_c = χ²_{0.95,4} = 9.49 [note some tables
give χ²_{α,ν} rather than χ²_{(1-α),ν}] for a given set of observations for which the
initial distributional hypothesis is correct. Occurrence of the outcome χ² > χ²_c
is considered unlikely (at the specified risk or significance level α) and is
therefore the basis for rejecting the original hypothesis.
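
The quoted critical value can be checked without tables. For an even number of degrees of freedom the chi-square cumulative distribution has a closed form, and the Python sketch below uses it. The function name is hypothetical, and the restriction to even ν is a simplification made only for this illustration.

```python
import math


def chi2_cdf_even(x, nu):
    """P(chi-square <= x) for an even, positive number of degrees of freedom nu."""
    if nu <= 0 or nu % 2 != 0:
        raise ValueError("this closed form requires even, positive nu")
    half = x / 2.0
    term = total = 1.0
    for j in range(1, nu // 2):
        term *= half / j          # builds (x/2)^j / j!
        total += term
    return 1.0 - math.exp(-half) * total


if __name__ == "__main__":
    # Probability content below the critical value quoted in the text.
    print(chi2_cdf_even(9.49, 4))
```

Evaluating the function at 9.49 with ν = 4 returns a probability very close to 0.95, consistent with the 5% significance level used in the example.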


As an example let us test the proposition that the time-to-failure data of
Table V belong to a two-parameter Weibull distribution with parameters β = 2.3
and η = 10,000 hrs as obtained from Fig. 22. Implementation of the χ² test
is shown in Table VI. Notice that we have contrived to have 5 events per
class interval by choosing unequal class intervals. Having only 20 data points
to work with is still a little confining and allows only 4 classes in
connection with the former choice. As we see from Table VI the two-parameter Weibull
model is not rejected at the 5% significance level.

The reader is referred to the literature for additional operational level
information associated with implementation of the χ² test. To summarize, the
essential elements of the method are:

1. Select the distribution (pdf) to be tested.

2. Choose the desired level of significance α.

3. Specify the parameters of the distributional hypothesis (perhaps by
fitting the data itself).

4. Decompose event space into class intervals.

5. Tally the observed data by class to obtain the observed frequencies.

6. Calculate the expected class frequencies (by taking differences of the
cumulative of the test distribution evaluated at the class boundaries).

7. Form the test statistic χ² [see Eq. (95)].

8. Calculate the number of degrees of freedom for the problem.

9. Compare the test statistic with the critical value obtained from tables
of the χ²-distribution.
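
The steps above can be sketched in Python for a two-parameter Weibull hypothesis. The sketch is illustrative only. The class boundaries, parameter values, and function names are supplied by the user and are hypothetical here, and the data of Table V are not reproduced. The tail probability helper covers only the single degree of freedom case that arises with 4 classes and 2 fitted parameters.

```python
import math


def weibull_cdf(t, beta, eta):
    """Two-parameter Weibull cumulative distribution F(t)."""
    return 1.0 - math.exp(-((t / eta) ** beta))


def chi_square_gof(times, boundaries, beta, eta, fitted_params=2):
    """Return (chi2, nu); classes run from 0 to infinity, split at the boundaries."""
    n = len(times)
    edges = [0.0] + list(boundaries) + [math.inf]
    chi2 = 0.0
    for low, high in zip(edges[:-1], edges[1:]):
        observed = sum(1 for t in times if low < t <= high)               # step 5
        expected = n * (weibull_cdf(high, beta, eta) - weibull_cdf(low, beta, eta))  # step 6
        chi2 += (observed - expected) ** 2 / expected                     # step 7
    nu = (len(edges) - 1) - fitted_params - 1                             # step 8
    return chi2, nu


def chi2_tail_one_dof(x):
    """P(chi-square > x) for one degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))
```

The hypothesis is rejected at significance level α when the tail probability of the computed statistic falls below α, which is equivalent to the table comparison in step 9.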

The Kolmogoroff-Smirnov test statistic d is the maximum absolute difference
of two cumulative distribution functions for some observed set of values of the
independent variable x. Thus for n observations

$$ d = \max_r \left| S_n(x_r) - F(x_r) \right| , \tag{97} $$

where

S_n(x_r) = observed cdf at the rth failure

F(x_r) = hypothesized cdf at x = x_r.

Equation (97) is asymptotically distributed as (Ref. 22)

$$ \lim_{n \to \infty} P\left( d > C/\sqrt{n} \right) = 2 \sum_{m=1}^{\infty} (-1)^{m-1} \exp\left( -2 m^2 C^2 \right) . \tag{98} $$

This result together with exact calculations of the probability P(d > C/√n) for
small n allow tables of critical values of the Kolmogoroff-Smirnov test statistic
to be constructed (Refs. 23, 24). This information is now also commonly reproduced in
textbooks treating statistical inference and statistical methods for reliability.
In evaluating Eq. (97) the observed cumulative distribution at the rth failure
is estimated by the rank fraction

$$ S_n(x_r) = r/n . \tag{99} $$

Or if the data are organized into m groups (rather than n groups of 1),
Eq. (99) generalizes to

$$ S_n(x_g) = r_g / n , \qquad g = 1, \ldots, m , \tag{100} $$

where the m quantities r_g are the failure order numbers corresponding to the
upper group boundaries. As was done for the χ² test the Kolmogoroff-Smirnov
test statistic is compared with tabulated critical values corresponding to some
stated level of significance. Application of the test again to the data of
Table V is displayed in Table VII. We conclude as before that the two-parameter
Weibull model (β = 2.3, η = 10,000 hrs) cannot be rejected.


Again we might summarize the basic features of the Kolmogoroff-Smirnov
goodness-of-fit test.

1. Select the cumulative distribution function (cdf) to be tested.

2.  Choose  the  desired  significance  level. 

3.  Specify  the  parameters  of  the  distributional  hypothesis  (preferably 
not  by  fitting  the  observed  data). 

4.  Tabulate  the  observed  data  by  rank  fraction  to  obtain  the  experimental 
cumulative  distribution. 

5.  Calculate  the  corresponding  expected  cumulative  distribution  values 
for  the  test  hypothesis. 

6. Take differences of (4.) and (5.) and identify the Kolmogoroff-Smirnov
test statistic.

7.  Compare  the  test  statistic  with  tabulated  values  and  draw  a  conclusion. 
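
A minimal Python sketch of the procedure follows. It uses the rank-fraction estimate r/n of the observed cumulative distribution, as in the text, and evaluates the large-sample tail probability from the alternating series. Many references also compare the hypothesized cdf with (r-1)/n. That refinement is omitted here. The function names are hypothetical, and for small n tabulated exact critical values should be preferred to the asymptotic formula.

```python
import math


def ks_statistic(values, cdf):
    """d = max over r of |r/n - F(x_r)| for the ordered observations x_r."""
    ordered = sorted(values)
    n = len(ordered)
    return max(abs(r / n - cdf(x)) for r, x in enumerate(ordered, start=1))


def ks_asymptotic_tail(c, terms=100):
    """Large-n limit of P(d > c / sqrt(n)) from the alternating exponential series."""
    return 2.0 * sum((-1) ** (m - 1) * math.exp(-2.0 * m * m * c * c)
                     for m in range(1, terms + 1))
```

For a sample of size n the approximate significance of an observed statistic d is `ks_asymptotic_tail(d * math.sqrt(n))`. A value of c near 1.36 corresponds to the familiar 5% level.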


Some provisos associated with the use of the two goodness-of-fit tests
described in this section of the report are:

1. The χ² test is preferred for evaluating discrete distributions while for
continuous distributions one should favor the Kolmogoroff-Smirnov test.

2. The χ² test is suitable for situations where the values of parameters
used in specifying the test hypothesis are obtained from the same data
record as is used in the test itself.

3. The conditions of item (2.) compromise the Kolmogoroff-Smirnov test.

4. The data grouping required in the χ² test precludes its use with very
small samples.

5. There are no restrictions on applying the Kolmogoroff-Smirnov test
in small sample situations.

6.4 Testing: Context and Cost

Much of this report thus far has related to structuring the analytical
machinery for interpreting reliability/life information. Testing is simply the
systematic exercising of equipment to yield failures from which reliability
inferences can be drawn. Thus given copious amounts of this kind of information
we can construct models and evaluate their parameters. Often this takes the
form of trying to ascertain whether contractual obligations are being met in
connection with a particular procurement. This latter kind of evaluation is
called acceptance testing. In acceptance testing one is concerned with the
tradeoff problem of being fair to both consumer and producer while at the same
time being reasonably precise about discriminating between superior and inferior
equipment. The situation is usually quantified in terms of the producer's risk
α, the consumer's risk β, and the discrimination ratio k. The producer's risk
is the probability that equipment of adequate quality will be rejected by the
test. The consumer's risk is the chance the buyer takes that actually inferior
hardware will be judged acceptable. The discrimination ratio is the quotient
of the nominal or upper level of desired performance (MTBF for example) and the
minimum acceptable or lower performance level. These quantities are identified
in Fig. 23, which is one form of the operating characteristic (OC) curve. The OC
curve shows, for a particular underlying distribution, the probability of passing
an acceptance test versus the true performance attribute of the equipment being
evaluated. The detailed shape of the OC curve depends on the sample size and
the level of performance demanded. Reduced consumer and producer risks and
higher discrimination (smaller k) require more testing.
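
To make the roles of the two risks concrete, the Python sketch below evaluates an operating characteristic for one simple kind of plan. It assumes a fixed-length test in the exponential (constant hazard rate) case, so that the number of failures in a given cumulative test time is Poisson distributed, and it accepts the equipment when no more than a stated number of failures occur. This plan and the function names are assumptions of the illustration and are not taken from Fig. 23.

```python
import math


def prob_accept(mtbf, test_time, max_failures):
    """Probability of at most max_failures failures in test_time (Poisson model)."""
    mean = test_time / mtbf
    term = total = math.exp(-mean)
    for k in range(1, max_failures + 1):
        term *= mean / k          # builds exp(-mean) * mean^k / k!
        total += term
    return total


def risks(upper_mtbf, lower_mtbf, test_time, max_failures):
    """Producer's risk at the upper MTBF and consumer's risk at the lower MTBF."""
    producer = 1.0 - prob_accept(upper_mtbf, test_time, max_failures)
    consumer = prob_accept(lower_mtbf, test_time, max_failures)
    return producer, consumer
```

Evaluating `prob_accept` over a range of true MTBF values traces out the OC curve for the plan. Lengthening the test while adjusting the allowed number of failures steepens the curve, which is the sense in which lower risks and a smaller discrimination ratio require more testing.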

In general the statistical interpretation of reliability tests can become
quite involved. One needs to consider whether the test is time terminated,
failure terminated, censored, or uncensored and whether the data obtained are
time-to-failure, failures per interval, or total failures in total time
information. Does one know in advance from what distribution the sample is
drawn or is this to be inferred from the test? Our purpose is not to explore
all of these avenues here but rather to focus on certain economies that have
been developed. A great deal of modern acceptance testing is based on the
pioneering work of Abraham Wald (Ref. 25) in the area of sequential testing. In this
case for a specified underlying distribution one establishes an open ended test
plan and keeps track of a probability ratio statistic relating to the
probabilities that the observed number of failures belong to a realization of the
upper and lower performance limits. The accept/reject decision is based on
the behavior of this statistic and such a test is called a probability ratio
sequential test or simply a sequential test. Sequential testing is most fully
developed for the exponential case. A variety of test plans, operating
characteristics, and expected test time characteristics for this situation are
displayed in Ref. 26. A typical format for sequential testing sampling plans or
decision making plots is that shown in Fig. 24. Cumulative failures are plotted
versus cumulative equipment operating time. Migration of the stepwise plot
line outside the "continue testing" region results in an accept or reject
decision being reached. The truncation boundaries in time (t) and failures (r)
are due to Epstein (Ref. 27).
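
The decision rule behind such a plot can be sketched in a few lines of Python. The sketch assumes the exponential case and uses Wald's approximate thresholds β/(1-α) and (1-β)/α on the probability ratio. It omits the truncation boundaries, and the function and argument names are hypothetical.

```python
import math


def sprt_decision(failures, total_time, upper_mtbf, lower_mtbf, alpha, beta):
    """Return 'accept', 'reject', or 'continue' for the exponential case."""
    # Log of the probability ratio: likelihood of the data at the lower MTBF
    # divided by the likelihood at the upper MTBF.
    log_ratio = (failures * math.log(upper_mtbf / lower_mtbf)
                 - (1.0 / lower_mtbf - 1.0 / upper_mtbf) * total_time)
    if log_ratio <= math.log(beta / (1.0 - alpha)):
        return "accept"
    if log_ratio >= math.log((1.0 - beta) / alpha):
        return "reject"
    return "continue"
```

Because the log ratio is linear in both the failure count and the cumulative operating time, the accept and reject boundaries appear as parallel straight lines on a plot of cumulative failures versus cumulative time.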

Additional details concerning the design of a sequential test plan tailored
to a particular application are presented in Ref. 2. We do not elaborate on
this here because we are ultimately more interested in making actual hardware
reliability improvements than fine tuning the evaluation process for the
exponential model. Sequential testing has much to recommend it, however.
Wald (Ref. 25) has shown that a sequential plan has an average risk no greater than a
test where the sample size is chosen in advance. At the same time good units
are promptly accepted while bad units are rejected efficiently in terms of the
test time required to make a decision. Reduced time and expense associated
with testing is a principal advantage of the sequential approach. Not
surprisingly (though perhaps unfortunately) the greatest test time required is
associated with coming to a decision when the true reliability is close to the
design objectives (upper and lower test limits). While the economies of
sequential testing are real, the amount of time that must be invested in hardware
evaluation is still significant. Inspection of the test plans of Ref. 26
shows that the cumulative test time required ranges typically from 2 to 20
times the MTBF value being demonstrated.

In the case of routine acceptance testing one often has well developed
expectations  concerning  how  the  test  should  turn  out.  This  may  be  based  on 
experience  with  similar  equipments  previously  evaluated.  In  such  cases  it  is 
possible  to  realize  an  additional  reduction  in  required  test  time  by  using  the 
celebrated  and  controversial  methods  of  Bayesian  inference.  This  subject  area 
has  developed  around  the  conditional  probability  theorem  first  established  by 
Bayes (Ref. 28) over two hundred years ago. For the reader to whom Bayesian inference
is  new,  Ref.  29  is  a  suggested  point  of  departure.  Reference  29  is  a  special 
issue  of  "IEEE  Transactions  on  Reliability"  devoted  exclusively  to  Bayesian 
inference.  Most  of  the  papers  have  a  review  orientation.  One  of  the  advertised 
benefits  of  Bayesian  theory  is  that  it  allows  subjective  or  personal  preference 
kinds  of  inputs.  There  is  continuing  dialogue  concerning  whether  this  is 
permissible  in  science,  how  it  should  be  done,  and  what  Bayesian  forms  are 
appropriate  in  treating  reliability  problems. 

Following Refs. 11 and 30 Bayes theorem may be stated as

$$ P(A_i \mid B) = \frac{P(B \mid A_i) \, P(A_i)}{\sum_{j} P(B \mid A_j) \, P(A_j)} , \tag{101} $$

where in the reliability context the elements of Eq. (101) have the following
interpretations:

A_i        a set of mutually exclusive and exhaustive (for B) hypotheses or
           belief statements

B          an event or piece of evidence that relates to the truth or
           credibility of the A_i

P(A_i)     elements of the prior probability distribution, that is, the
           probabilities assigned to the hypotheses before evidence B
           becomes available

P(B|A_i)   likelihoods or conditional probabilities that the evidence B
           will obtain assuming the truth of each of the A_i separately

P(A_i|B)   posterior probabilities of the A_i given the evidence B.

The  denominator  of  the  right  side  of  Eq.  (101)  is  the  total  probability  of  the 
evidence B calculated by weighting the P(B|A_i) by the hypothesis probabilities
over  the  entire  ensemble.  That  Eq.  (101)  is  a  correct  logical  statement  is  not 
disputed.  However,  if  unrealistic  prior  information  is  supplied,  conclusions 
drawn  from  using  Eq.  (101)  may  be  expected  also  to  be  unrealistic  and  of  little 
value.  This  is  the  center  of  the  multifaceted  Bayesian  controversy.  Is 
mathematical  convenience  sufficient  justification  to  prefer  conjugate  forms  of 
the  theory  (prior  and  posterior  distributions  belonging  to  the  same  functional 
family)?  Are  prior  distributions  unsupported  by  actual  data  to  be  considered 
legitimate?  Should  one  prefer  continuous  or  discrete  descriptions?  There  are 
other  pitfalls  to  the  uninitiated.  Some  forms  of  the  theory  emphasize  what  are 
called  loss  and  risk  functions  (see  Refs.  20  and  21  for  example).  In  this 
approach one seems to be more concerned with the impact of one's decisions than
their  empirical  basis. 
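
A discrete application of Bayes theorem is short enough to sketch in Python. In this illustration the hypotheses are a handful of candidate MTBF values, the evidence is a failure count observed in a stated cumulative test time, and the likelihoods are taken to be Poisson (the exponential case). These modeling choices, the prior values a user would supply, and the function names are all assumptions of the sketch and are not drawn from the references.

```python
import math


def posterior(priors, likelihoods):
    """Posterior P(A_i|B) from priors P(A_i) and likelihoods P(B|A_i)."""
    total = sum(p * like for p, like in zip(priors, likelihoods))   # total probability of B
    return [p * like / total for p, like in zip(priors, likelihoods)]


def poisson_likelihood(failures, test_time, mtbf):
    """P(B|A_i) when B is 'failures observed in test_time' and A_i fixes the MTBF."""
    mean = test_time / mtbf
    return math.exp(-mean) * mean ** failures / math.factorial(failures)
```

The posterior from one block of test evidence can serve as the prior for the next, which is how earlier experience with similar equipment can shorten a later demonstration. The quality of the result still rests entirely on how realistic the prior is.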

It is not our purpose here to present or elaborate on Bayesian inference
theory in any detail. (This has been the subject of a number of books and very
many technical papers.) We simply wish to call attention to its existence and
its  apparent  relevance  to  reliability  problems.  Perhaps  some  additional  guidance 
and  accession  to  the  literature  can  be  provided  as  well.  Reference  31  considers 
sequential  testing  from  a  Bayesian  viewpoint  and  shows  sampling  plans  quite 
suggestive  of  Fig.  24  of  this  report.  Reference  32  discusses  obtaining  prior 
distributions  from  available  data  for  actual  hardware  equipments  (mostly 
electronic).  Reference  30  presents  a  very  appealing  demonstration  of  the 
advantages  of  a  discrete  Bayesian  formulation  and  the  practicalities  of  its  use 
in  treating  reliability  problems.  Intriguingly  this  source  suggests  that  viable 
Bayesian  prior  distributions  be  arrived  at  by  committee  in  what  amounts  to  an 
engineering  design  review  setting.  A  direct  comparison  of  fixed  sample  size, 
sequential,  and  Bayesian  reliability  demonstration  testing  plans  with  respect 
to their relative efficiency in terms of required test time is given in Ref. 33.
Obviously when it can be properly structured, Bayesian inference is very
efficient.
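
A discrete form of the Bayesian update described above can be sketched in a few lines. This is only an illustration: the candidate MTBF values, the prior weights, and the observed evidence are hypothetical, and the Poisson likelihood assumes a constant hazard rate, which this report questions for sonar transducers.

```python
import math

def poisson_likelihood(failures, hours, mtbf):
    """P(B | A_j): chance of seeing `failures` in `hours` if the MTBF is `mtbf`."""
    expected = hours / mtbf
    return math.exp(-expected) * expected ** failures / math.factorial(failures)

def discrete_bayes(prior, likelihoods):
    """Posterior P(A_j | B) from prior P(A_j) and likelihoods P(B | A_j)."""
    joint = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(joint)  # total probability of the evidence B
    return [j / total for j in joint]

# Hypothetical candidate MTBF values (hours) and a committee-style prior.
mtbf_hypotheses = [25_000.0, 50_000.0, 100_000.0]
prior = [0.2, 0.5, 0.3]

# Hypothetical evidence B: 2 failures in 60,000 unit-hours of service.
likelihoods = [poisson_likelihood(2, 60_000.0, m) for m in mtbf_hypotheses]
posterior = discrete_bayes(prior, likelihoods)

for m, p0, p1 in zip(mtbf_hypotheses, prior, posterior):
    print(f"MTBF {m:>9.0f} h: prior {p0:.2f} -> posterior {p1:.3f}")
```

The normalizing sum in `discrete_bayes` plays the role of the denominator discussed above: the likelihoods weighted by the prior over all hypotheses.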

As  has  been  mentioned  this  report  is  more  concerned  with  improving  the 
design  of  sonar  transducers  than  evaluating  current  production.  Nevertheless 
design  improvement  begins  by  trying  to  keep  what  is  right  about  the  item  in 
question  and  change  what  is  wrong.  With  respect  to  both  of  these  categories  in 
the  sonar  setting  there  seems  to  be  a  wealth  of  information  which  one  might 
like  to  process  using  Bayesian  methods.  The  scope  of  this  report  does  not 
allow  the  development  of  solutions  of  this  kind  here.  Only  encouragement  to 
carry  on  can  be  provided. 


7.0  SPECIAL RELIABILITY DIFFICULTIES FOR NAVY SONAR EQUIPMENT


Thus  far  in  this  report  we  have  developed  a  number  of  topics  that  are  a 
part  of  the  standard  machinery  for  dealing  with  reliability  problems.  The  area 
of  wet-end  sonar  equipment  offers  some  unique  challenges  in  applying  these 
methods  as  we  shall  see  in  this  section. 


7.1  Heroic  Time  Scale 


Numbers like 100,000 hours have been written into recent sonar transducer
procurements as the required mean-time-between-failures statistics (MTBF's) for
these  equipments.  I  am  not  suggesting  that  such  a  performance  objective  is 
unrealistic.  Earlier  it  was  pointed  out  that  sonar  transducers  are  rather 
uncomplicated  and  long  life  should  be  realizable  on  the  basis  of  their  structural 
simplicity  (and  sound  design).  Nevertheless  the  MTBF's  called  for  must  be 
recognized  as  large  numbers.  By  way  of  comparison  the  subsystem  MTBF's  of 
modern  jet  fighter  aircraft  range  from  a  few  hours  to  tens  or  hundreds  of  hours. ^ 
Taking  these  elements  together  the  entire  aircraft  may  exhibit  an  MTBF  in  the 
range  0.5  to  3  hours. ^  The  sonar  transducer  reliability  statistic  is  seen  to 
be  5  orders  of  magnitude  larger  than  this.  It  becomes  unrealistic  to  construct 
a  conventional  acceptance  test  to  assure  the  Navy  that  it  is  receiving  what  it 
bargained  for.  In  the  limited  procurement  setting  such  a  test  would  be 
prohibitively expensive and the results would not be timely. New approaches to
characterizing the reliability of long-lived systems should be cultivated.
This might take the form of demonstrating reliability after the fact through
fleet experience and contriving to achieve it in future procurements through
controlled engineering practices.
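
To give a feel for the time scale, the sketch below estimates the failure-free test exposure needed to demonstrate a 100,000 hour MTBF. It rests on assumptions not taken from this section: exponentially distributed times to failure, a zero-failure acceptance rule, a 90% confidence level, and a hypothetical lot of 20 units on test.

```python
import math

def zero_failure_test_hours(mtbf_goal, confidence):
    """Total unit-hours with no failures needed to show MTBF >= mtbf_goal.

    Assumes exponential times to failure: the chance of surviving T
    unit-hours at the goal MTBF is exp(-T / mtbf_goal), and the test must
    drive that chance down to 1 - confidence.
    """
    return -mtbf_goal * math.log(1.0 - confidence)

HOURS_PER_YEAR = 8766.0  # 365.25 days

total_hours = zero_failure_test_hours(100_000.0, 0.90)
units_on_test = 20  # hypothetical lot size for a limited procurement
years_each = total_hours / units_on_test / HOURS_PER_YEAR

print(f"total unit-hours required: {total_hours:,.0f}")
print(f"calendar years with {units_on_test} units on test: {years_each:.2f}")
```

Even under these favorable assumptions the exposure is a few times the MTBF itself, so a small lot must run for more than a year without a single failure.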


7.2  Gaps in the Quality and Kind of Hazard Rate Data

Contractors  bidding  on  sonar  transducer  procurements  are  usually  asked  to 
prepare  a  handbook-style  prediction  of  the  reliability  of  the  item  in  question. 
This  ordinarily  requires  quite  a  bit  of  creativity  since  the  standard  data 
sources  such  as  Ref.  7  do  not  provide  the  necessary  information.  Nevertheless 
the  task  is  invariably  completed  and  a  predicted  reliability  slightly  superior 
to  that  requested  by  the  Navy  is  advertised.  (One  could  hardly  do  otherwise 
and  expect  to  win  the  contract.)  All  aspects  of  this  exercise — what  the  Navy 
asks  for  and  what  the  contractors  deliver — seem  a  bit  misdirected.  Structural 
arguments  have  been  presented  that  one  would  look  for  wearout  phenomena  rather 
than  the  exponential  reliability  modeled.  Fleet  service  data^  acquired  on 
TR-155F transducers incorporated in the AN/BQQ-5 sonar system confirm this.
These data represent a characteristically wearout-like cumulative time-to-failure
function. In this case the life-limiting process is identified as
corrosion and debonding along the rubber-window/headmass-shroud interface.

Our  thesis  in  this  section  is  that  a  basis  for  exponential  prediction  modeling 
of  sonar  transducers  does  not  in  general  exist.  Further  there  is  presently 
insufficient  information  available  to  engage  in  any  detailed  predictive  modeling 
of  a  new  design  that  is  significantly  altered  from  its  predecessors.  The  call 
to action in this is that the sonar community itself must develop its own relevant
data sources and reliability experience.


7.3  Need to Define Systems Operating Requirements

In  Section  5.1  we  foreshadowed  the  need  for  precise  definitions  of  the 
performance  requirements  of  hardware  of  interest.  For  example  in  the  sonar 
case  the  reliability  required  of  a  transducer  will  depend  on  the  array 
configuration  of  similar  units  and  the  desired  array  performance.  Some  studies 
have been carried out by sonar systems personnel relating to the degradation of
array beam forming characteristics as a function of elements lost to service.
One needs further to relate this kind of analysis to characteristics of ultimate
interest  such  as  target  recognition  capability.  It  is  only  when  the  operating 
characteristics  of  the  system  are  defined  and  related  to  array  parameters  that 
the  reliability  specialist  can  construct  a  specification  for  a  single  transducer 
coordinated  with  overall  mission  objectives.  It  is  often  taken  as  a  general 
rule that an array must be 90% intact to function adequately. To improve on
this  description  requires  closer  cooperation  between  sonar  systems  and  reliability 
personnel  than  has  heretofore  been  practiced. 
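
The 90% rule of thumb can be turned into a rough link between element reliability and array reliability. The sketch below assumes independent elements sharing one reliability value and a hypothetical 100-element array; neither assumption comes from this report, and a real array specification would also weight element position.

```python
from math import comb

def array_reliability(n_elements, min_working, element_reliability):
    """Probability that at least `min_working` of `n_elements` still function.

    Assumes independent elements that share one reliability value.
    """
    r = element_reliability
    return sum(
        comb(n_elements, k) * r ** k * (1.0 - r) ** (n_elements - k)
        for k in range(min_working, n_elements + 1)
    )

# Hypothetical 100-element array that must stay 90% intact.
for r in (0.99, 0.95, 0.90, 0.85):
    print(f"element reliability {r:.2f} -> array {array_reliability(100, 90, r):.4f}")
```

Working backward from a required array reliability to the element value is the kind of step that lets a single-transducer specification be tied to mission objectives.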


7.4  Undefined Process Endpoints

In  order  to  avoid  wearout  failures  it  is  desirable  to  employ  preventive 
maintenance  as  a  tactic.  Often  this  involves  monitoring  some  component  attribute 
and  replacing  the  part  when  a  specified  service  limit  is  reached.  Alternatively 
a  regular  replacement  interval  may  be  established  without  regard  to  evaluating 
the  apparent  condition  of  the  item  involved.  In  the  sonar  transducer  case 
there  are  identified  wearout  processes  for  which  a  service  limit  has  not  been 
specified.  For  example,  how  much  corrosion  of  the  housing  can  be  tolerated 
before  the  risk  of  perforation  is  considered  unacceptably  high?  Or  a  much  more 
tantalizing  illustration  is  the  following:  It  is  felt  that  it  is  undesirable 
to  have  water  inside  a  transducer  due  to  its  role  in  promoting  corrosion  and 
degrading electrical breakdown characteristics. Elastomers, particularly
neoprene and polyurethane rubbers, are often used as the primary moisture
barriers in projector and hydrophone installations. But it is known that these
materials are permeable to moisture (Refs. 36 and 37). Thus the important question is not
whether water is present in transducers but how much can be tolerated. Desiccants
are often incorporated to reduce transducer humidity levels. But the basic
question, together with its life-limiting implications, remains unanswered.
Moisture is a concern in both gas-filled and oil-filled transducers. In the
latter case the solubility of water is an important fill fluid characteristic.
There  have  been  transducer  designs  of  both  the  gas-filled  and  oil-filled  types 
that  have  given  good  service.  Still  a  full  understanding  of  why  some  designs 
outperform  others  seems  to  await  research  on  the  role  water  plays  in  fostering 
wearout  processes. 


7.5  Test Method Nonuniformity

Some  of  the  complicating  features  already  mentioned  in  connection  with 
sonar  applications  have  led  to  differing  responses  within  the  community.  Thus 
a  variety  of  ways  of  evaluating  equipment  have  evolved.  Before  a  transducer  is 
mounted  on  a  ship,  definitive  acoustical  evaluation  can  be  carried  out  at  any 
of  several  specialized  Naval  facilities.  After  transducers  are  mounted  some 
array evaluation work can be performed using prepared targets. For the most
part, however, simpler schemes for testing transducers are preferred. Thus
most transducer diagnostic activity involves making simple resistance measurements
at inboard terminal boxes servicing the transducer electrical cables. To what
degree these resistances correlate with the acoustical performance of corresponding
units is not established. The test is not designed to distinguish between
cable difficulties and anomalous behavior of the transducer itself. In some
systems there is a disconnect criterion at which point an individual transducer
is no longer felt to be beneficially contributing to the overall array performance.
Thus when the resistance of a given transducer falls below this critical value,
the unit is electrically removed from service. There is generally another
service limit on transducer resistance which calls for replacement during an
overhaul. Usually this replacement boundary represents substantial deterioration
of the resistance specified for new equipment.

Several aspects of this situation are somewhat disquieting from a reliability
evaluation point of view. Different sorts of test entirely (acoustical versus
resistance) are associated with the qualification of new equipment and its
ultimate removal from service. The disconnect criterion probably has a basis
in signal processing theory. On the other hand the replacement resistance
criterion seems arbitrary and but little related to acoustical performance.

7.6  Need for Systematic and Uniform Data Acquisition

For  a  variety  of  reasons  stated  above  reliability  prediction  associated 
with  sonar  transducer  procurements  is  very  difficult.  The  time  scale  is  heroic, 
production  is  often  quite  limited,  and  operating  requirements  and  life-limiting 
processes  are  frequently  incompletely  characterized.  Thus  until  some  of  the 
basic  open  questions  relating  to  permeation  effects,  bond  degradation,  corrosion, 
electrical  breakdown,  etc.  are  answered,  it  seems  unrealistic  to  expect  to 
realize  specific  reliability  objectives  in  connection  with  any  given  new 
procurement. However, all transducer systems are monitored and maintained.
Thus by focusing attention on transducers in place in the fleet, one ought to
be able to identify problem areas and suggest improvements to be implemented on
later models. We have already seen that limited testing during hardware design
can also flag overt problems and lead to much improved equipment. In the latter
case the benefits are more immediate. In both cases the equipment may be better
but the thrust of prediction is frustrated in that one does not know how good.
In the sonar setting it seems most realistic to demand an answer to this question
only after fleet experience is acquired. If this experience is satisfactory,
build new units in the same way. If improved performance is desired, identify
the problem areas and make changes.

The basis for progress along the lines just discussed is eternal vigilance.
That is, the performance of sonar systems should be systematically monitored
and complete records kept. Care should be taken that tests are uniformly
applied. For the purpose, resistance measurements are certainly admissible
whether or not correlations with acoustical performance are established. It is
recognized that sonar transducer diagnostics can be developed only on a limited
opportunity basis. This in itself is not a fundamental problem but does serve
to punctuate the need for keeping good records and treating each maintenance
interval as an opportunity to acquire reliability data. To the procurement
manager seeking near term results, this program may seem inadequate. No superior
alternative suggests itself although parallel efforts on the development of
accelerated test methods are probably worthwhile. One should remember that the
present  state  of  the  reliability  improvement  art  for  sonar  transducers  is  well 
described  as  stumbling  from  crisis  to  crisis.  It  is  only  by  recognizing  where 
one  is  beginning  that  plans  for  progress  can  be  well  structured. 


8.0  RECOMMENDATIONS


Thus far in this report we have tried to identify the nature of the sonar
transducer reliability problem. In addition some of the relevant analytical
machinery for dealing with reliability has been introduced. Now we will treat
briefly some of the ways in which the latter may be correctly applied to the
former.  Some  technical  and  some  operational  concerns  will  be  considered. 


8.1  Methods Applicability


As  has  been  observed  a  description  of  sonar  transducer  reliability  has 
some  unique  features.  At  least  some  wearout  processes  occur  in  fleet  service 
and  in  certain  cases  tend  to  dominate  the  reliability  description.  Thus  for 
the  most  part  handbook  prediction  methods  of  either  the  Part  Stress  Analysis  or 
Parts  Count  type  seem  to  be  inappropriate.  Of  course  a  formal  separation  into 
component groups which may be exponentially reliable or exhibit wearout can
always be made. The overall reliability is the product of the subclass
reliabilities. Failure rate experience needed to support such an approach,
when available at all, is likely to be dispersed among various production
contractors  rather  than  centrally  cataloged. 
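
The product rule for subclass reliabilities can be illustrated with one constant-hazard group and one wearout group. In the sketch below the Weibull form for the wearout group, the grouping into "electrical" and "seals", and every parameter value are assumptions chosen for illustration, not data from this report.

```python
import math

def exponential_reliability(t, hazard_rate):
    """Constant hazard rate subgroup: R(t) = exp(-lambda * t)."""
    return math.exp(-hazard_rate * t)

def weibull_reliability(t, characteristic_life, shape):
    """Wearout subgroup modeled as Weibull with shape > 1 (an assumption)."""
    return math.exp(-((t / characteristic_life) ** shape))

def overall_reliability(t):
    """Product of subgroup reliabilities; all parameter values are hypothetical."""
    electrical = exponential_reliability(t, hazard_rate=2.0e-6)  # per hour
    seals = weibull_reliability(t, characteristic_life=80_000.0, shape=3.0)
    return electrical * seals

for hours in (10_000, 40_000, 80_000):
    print(f"t = {hours:>6} h  R = {overall_reliability(hours):.4f}")
```

The product form assumes the subgroups fail independently and that any one failure fails the unit.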


The probabilistic design approach discussed in Section 4.2.2 is generally
applicable in principle. It is a microscopic description enabling one to
characterize reliability on a per-stress-application basis. The connection
between  probabilistic  design  and  what  we  have  called  macroscopic  reliability 
was  dealt  with  in  Sections  5.2.1  and  5.2.2.  Probabilistic  design  requires  a 
distributional-level  description  of  environmental  and  service  stresses  and 
component  strengths.  When  this  information  is  available,  the  method  is  very 
powerful.  However,  acquiring  the  information  for  a  particular  problem  area  of 
interest  may  necessitate  a  separate  research  program. 
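
As a minimal sketch of what a distributional-level description buys, the example below computes the reliability for a single stress application when stress and strength are independent and normally distributed. The normal forms and the numerical values are assumptions made for illustration; the method itself does not require normal distributions.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def interference_reliability(mean_strength, sd_strength, mean_stress, sd_stress):
    """P(strength > stress) for independent, normally distributed stress and strength."""
    margin = mean_strength - mean_stress
    spread = math.sqrt(sd_strength ** 2 + sd_stress ** 2)
    return normal_cdf(margin / spread)

# Hypothetical bond strength and applied stress, in the same units.
r_single = interference_reliability(50.0, 4.0, 35.0, 3.0)
print(f"reliability per stress application: {r_single:.5f}")
```

The result depends on the spreads as much as on the means, which is why mean values alone cannot support this kind of analysis.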


We  see  that  in  some  ways  the  nature  of  the  sonar  reliability  problem  has  a 
pejorative  impact  on  efforts  to  apply  standard  methods.  In  other  ways  the 
situation  is  ameliorating.  The  distribution  of  the  hardware  of  interest  to  us 
is limited to a single customer, the Navy. This customer is highly organized
and  meticulous  in  dealing  with  maintenance  and  renewal  functions.  The  inventory 
control  process  is  itself  practically  a  macroscopic  reliability  experiment. 

The  Navy  sonar  setting  also  exhibits  a  rather  controlled  evolutionary 
characteristic.  New  procurements  are  typically  only  slightly  altered  from  the 
generation of hardware being supplanted. Many features such as ceramic
configurations,  prestressing  arrangements,  and  coupling  and  decoupling  provisions 
are  pretty  well  established.  In  this  setting  of  slow  change  Bayesian  inference 
methods  would  seem  to  be  particularly  appropriate. 


In  this  section  it  is  our  purpose  to  evoke  neither  optimism  nor  gloom.  We 
simply  wish  to  point  out  that  a  reliability  program  can  be  no  stronger  than  the 
true  overlap  of  the  methods  employed  with  the  problem  addressed.  Obviously 
such  a  program  must  be  structured  by  individuals  capable  of  exercising  the 
necessary  critical  judgment  in  the  somewhat  esoteric  reliability  arena. 


8.2  Recognizing the Statistical Character of the Problem

The title of this section is somewhat curious in a report for which the
statistical aspects of reliability have represented a major focus throughout.
But one significant feature has not yet been emphasized: the question of product
similarity. We have seen that reliability cannot be related macroscopically to
the failure of a single item. Rather inferences must be drawn from the behavior
of a population of similar items. Ah, but given such a situation, can we tell
whether the units are similar (or nearly identical) or not? Let us look into
this.  Suppose  that  a  number  of  items  of  the  same  kind  are  in  fact  exponentially 
reliable  (an  assumption  not  requiring  justification  for  the  moment)  but  exhibit 
different constant hazard rates. Take these hazard rates $\lambda$ to be normally
distributed as

$$f(\lambda) = \frac{1}{\sigma_\lambda\sqrt{2\pi}}\exp\left[-\frac{(\lambda-\mu_\lambda)^2}{2\sigma_\lambda^2}\right]. \qquad (102)$$

Then since the reliability of an individual component is $R(t) = e^{-\lambda t}$ and
Eq. (102) is a normalized pdf, the reliability of the population is found by
weighting the constituent reliabilities according to

$$R(t) = \int_{-\infty}^{\infty} f(\lambda)\, e^{-\lambda t}\, d\lambda. \qquad (103)$$

Equation (103) is easily evaluated yielding

$$R(t) = \exp\left(-\mu_\lambda t + \tfrac{1}{2}\sigma_\lambda^2 t^2\right). \qquad (104)$$

The short term (small $t$) behavior of Eq. (104) is the same as if all units had
the same reliability

$$R(t) = e^{-\lambda t} \qquad (105)$$

as the typical or "average" unit ($\lambda = \mu_\lambda$). However as time passes the hazard
rate centroid of the population decreases as failures tend to favor removal of
less reliable units. Equations (105) and (104) are plotted in Fig. 25 to permit
this effect to be displayed graphically for the case $\sigma_\lambda = 0.25\mu_\lambda$ (25% dispersion
of the hazard rates). The two curves differ very little.
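
The closed form of Eq. (104) can be checked by completing the square in the exponent. The following is a sketch of that step, writing $\lambda$ for the hazard rate and $\mu_\lambda$, $\sigma_\lambda$ for the mean and standard deviation of its normal distribution, with the integral taken over all $\lambda$.

```latex
\begin{aligned}
-\lambda t - \frac{(\lambda-\mu_\lambda)^2}{2\sigma_\lambda^2}
  &= -\frac{\left[\lambda-(\mu_\lambda-\sigma_\lambda^2 t)\right]^2}{2\sigma_\lambda^2}
     - \mu_\lambda t + \tfrac{1}{2}\sigma_\lambda^2 t^2 , \\[4pt]
R(t) &= \exp\!\left(-\mu_\lambda t + \tfrac{1}{2}\sigma_\lambda^2 t^2\right)
     \int_{-\infty}^{\infty} \frac{1}{\sigma_\lambda\sqrt{2\pi}}
     \exp\!\left\{-\frac{\left[\lambda-(\mu_\lambda-\sigma_\lambda^2 t)\right]^2}{2\sigma_\lambda^2}\right\} d\lambda \\
     &= \exp\!\left(-\mu_\lambda t + \tfrac{1}{2}\sigma_\lambda^2 t^2\right).
\end{aligned}
```

The remaining integral is that of a normal density with shifted mean and equals one. The ratio of the dispersed population reliability to the single-rate result is therefore $\exp(\sigma_\lambda^2 t^2/2)$, which for $\sigma_\lambda = 0.25\mu_\lambda$ is only about 1.03 at $\mu_\lambda t = 1$.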

Before continuing we should notice that Eq. (104) exhibits a minimum at
$t = \mu_\lambda/\sigma_\lambda^2$ and diverges at $t = \infty$. This anomalous behavior is due to the finite
area under the left tail of Eq. (102) representing a small probability of hazard
rates near zero and even negative. This is of little practical concern and use
of Eq. (104) is proper for $\mu_\lambda t \ll (\mu_\lambda/\sigma_\lambda)^2$, the situation shown in Fig. 25.

To decide whether nominally alike components are nearly identical from a
reliability point of view or exhibit significant dispersion, it will be necessary
to distinguish profiles like the smooth curves shown in Fig. 25. The step
function also displayed in the figure represents a Monte Carlo simulated
measurement involving 10 units. Ten hazard rates $\lambda_i$ distributed per Eq. (102)
were chosen. Then since the cumulative single component unreliabilities
$U_i = 1 - e^{-\lambda_i t_i}$ are uniformly distributed (see Ref. 19 pages 62 and 63), the
simulated failure times are given by

$$t_i = -\frac{1}{\lambda_i}\ln(1 - U_i), \qquad (106)$$

where the $U_i$ or $1 - U_i$ are random numbers on the interval 0 to 1. What I call a
poor man's Monte Carlo simulation was employed for constructing Fig. 25. Thus
the random numbers were generated essentially by throwing darts at a telephone
book rather than via a fancy computer algorithm. The associations of particular
$\lambda$'s and $U$'s were also established by chance. The step function of Fig. 25 is
only representative and not unique. Repeating the simulation will yield a
different detailed outcome. The same is of course true of actual experiments
yielding time-to-failure data. We can see that a very refined experiment indeed
is required to select one of the curves of Fig. 25 as preferred over the other.
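
The simulation just described is easy to repeat with a pseudorandom generator in place of the telephone book. The sketch below is one possible version, not the procedure actually used for Fig. 25: the seed is arbitrary, time is measured in units of the mean MTBF, and the rare nonpositive hazard rates drawn from the normal distribution are simply redrawn, which is an added assumption.

```python
import math
import random

def draw_failure_times(n_units, mean_rate, sd_rate, rng):
    """Simulate one time to failure per unit via t = -ln(1 - U) / lambda."""
    times = []
    for _ in range(n_units):
        rate = rng.gauss(mean_rate, sd_rate)
        while rate <= 0.0:  # assumption: redraw the rare nonphysical rates
            rate = rng.gauss(mean_rate, sd_rate)
        u = rng.random()  # uniform on [0, 1)
        times.append(-math.log(1.0 - u) / rate)
    return sorted(times)

def population_reliability(t, mean_rate, sd_rate):
    """Dispersed population: exp(-mu*t + sigma^2 * t^2 / 2)."""
    return math.exp(-mean_rate * t + 0.5 * (sd_rate * t) ** 2)

def single_rate_reliability(t, mean_rate):
    """All units identical with hazard rate mu: exp(-mu*t)."""
    return math.exp(-mean_rate * t)

rng = random.Random(25)   # arbitrary seed so the run is repeatable
mu, sigma = 1.0, 0.25     # 25% dispersion of the hazard rates
failure_times = draw_failure_times(10, mu, sigma, rng)

print(" t/MTBF  surviving  dispersed  identical")
for k, t in enumerate(failure_times, start=1):
    surviving = (len(failure_times) - k) / len(failure_times)
    print(f"{t:7.3f}  {surviving:9.1f}  "
          f"{population_reliability(t, mu, sigma):9.3f}  "
          f"{single_rate_reliability(t, mu):9.3f}")
```

Changing the seed gives a different step function each time, and with only ten units the scatter between runs is typically much larger than the gap between the two smooth curves.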

One cannot conclude that reliability dispersion effects are fundamentally
indistinguishable.  But  as  a  practical  matter  for  items  expected  to  be 
exponentially  reliable,  significant  variations  in  the  reliability  parameters  of 
"similar"  items  are  not  likely  to  be  observed  via  the  customary  cataloging  of 
times  to  failure  under  similar  test  conditions.  This  is  not  to  be  construed 
particularly  as  an  argument  favoring  the  probabilistic  design  approach  to 
reliability  discussed  above  or  the  physics-of-aging  posture  advocated  by  Thomas-*®. 
We are simply trying to characterize and develop insights relating to the
macroscopic  approach  to  reliability  evaluation.  The  stochastic  aspects  of  the 
problem  preclude  finding  answers  to  questions  that  are  too  detailed.  On  the 
other  hand  probing  exactly  these  informational  limits  is  the  price  of  progress. 

Earlier  we  argued  that  reliability  statements  about  an  individual  item 
could  be  made  only  by  studying  a  population  of  similar  units.  Now  it  appears  that 
the  required  similarity  is  very  difficult  to  demonstrate.  It  seems  that  we 
have  come  full  circle  in  the  sense  that  observations  of  individual  units  may 
serve only to collectively characterize a population. The situation is probably
not  as  gloomy  as  it  begins  to  sound.  Very  likely  it  is  easier  to  build  similar 
components  through  meticulous  control  of  manufacturing  processes  than  it  is  to 
demonstrate  that  this  has  been  done.  In  any  event  there  are  several  lessons  to 
be  learned  from  this.  The  statistical  nature  of  the  problem  should  temper  the 
kinds  of  questions  one  asks.  Reliability  experiments  should  be  carefully 
thought  out  with  respect  to  the  relevancy  and  adequacy  of  the  information  to  be 
developed.  In  thinking  macroscopically  about  reliability  problems  it  is  often 
helpful  to  relax  the  tendency  to  look  for  rigid  associations  of  cause  and 
effect.  Statistical  problems  are  what  they  are  largely  because  of  their 
indeterministic  features.  Fortunately  most  reliability  analysis  methods  do  not 
depend  on  a  priori  product  similarity.  Assurance  of  similarity  is  needed  only  if 
one wishes to make sharp statements about the expected performance of an individual
item based on population studies. In the case of sharply clustered wearout
times to failure in parallel tests, the data record itself provides this information.
For the random hazard situation the analogous connection is very weak
(Fig.  25)  and  one  needs  to  insist  that  the  units  be  of  "identical"  manufacture. 


8.3  Data Requirements


To  be  consistent  with  the  general  scope  of  this  recommendations  section, 
data  requirement  guidelines  rather  than  comprehensive  responses  to  particular 
situations  are  suggested.  In  gathering  information  from  which  inferences 
relating  to  product  reliability  and  life  are  to  be  drawn  one  often  tests 
nominally identical items under controlled conditions. This is done against a
sharply defined standard of acceptable performance. Failure may be taken to be
any  departure  of  an  operational  or  physical  parameter  from  the  established 
norms.  Monitoring  the  properties  of  interest  and  comparing  with  the  relevant 
failure  thresholds  yields  a  set  of  times  at  which  failures  occur.  This  kind 
of  time-to-failure  information  is  the  preferred  form  from  which  to  construct 
distributional  analyses  from  the  macroscopic  viewpoint.  In  dealing  with  deployed 
sonar  transducer  arrays,  opportunities  for  evaluation  may  be  quite  restricted. 
This  naturally  leads  to  cumulative  failure  information  in  a  failures-per-interval 
format.  The  approach  is  quite  instructive  provided  100%  testing  is  carried  out 
at  each  checkpoint. 
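
A failures-per-interval record of this kind converts directly into a cumulative failure profile. The sketch below assumes 100% inspection at each checkpoint and no replacement of failed units between checkpoints; the array size, checkpoint times, and failure counts are hypothetical.

```python
def cumulative_failure_fraction(population, failures_per_interval):
    """Running fraction failed after each checkpoint, assuming 100% inspection
    and no replacement of failed units between checkpoints."""
    fractions = []
    failed = 0
    for count in failures_per_interval:
        failed += count
        fractions.append(failed / population)
    return fractions

# Hypothetical array of 200 transducers inspected at five overhauls.
checkpoint_hours = [10_000, 20_000, 30_000, 40_000, 50_000]
failures_found = [1, 2, 4, 9, 17]
fractions = cumulative_failure_fraction(200, failures_found)

for hours, frac in zip(checkpoint_hours, fractions):
    print(f"{hours:>6} h  cumulative fraction failed = {frac:.3f}")
```

If only a sample of the array were checked at each visit, the running total would undercount failures, which is why complete testing at every checkpoint matters.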

Inputs  to  the  probabilistic  design  approach  to  reliability  evaluation  are 
quite  detailed  as  has  been  mentioned  above.  Kececioglu  documents  some  of  these 
needs  in  Refs.  39  and  40.  Basically  one  requires  distributional  information  on 
applied  stresses,  component  strengths,  failure  governing  criteria,  and  a  variety 
of environmental, processing, and materials characteristics. Kececioglu (Refs. 39 and 40)
has  enunciated  an  appeal  to  the  engineering  community  to  improve  upon  the 
limited  availability  of  information  of  this  kind. 

The  use  of  Bayesian  inference  methods  in  dealing  with  reliability  problems 
begins  with  the  construction  of  a  prior  distribution.  This  requires  previous 
experience  with  the  same  or  similar  types  of  hardware.  A  more  consistently 
articulated  evaluation  of  fleet  operations  than  has  been  carried  out  previously 
may  be  required,  but  the  inventory  of  wet-end  sonar  equipment  seems  well  suited 
to  the  application  of  the  Bayesian  approach. 

All  three  reliability  analysis  methods  mentioned  in  this  section  ideally 
can  be  arranged  to  imply  a  time-to-failure  probability  density  function  type  of 
description.  The  form  of  this  function  is  of  course  very  suggestive  in 
classifying  phenomena  leading  to  failure.  Thus  to  some  extent  corrosion, 
fatigue,  etc.  often  exhibit  generally  characteristic  signatures.  One  can  look 
for  microscopic  confirmation  of  these  by  studying  basic  physical  processes, 
i.e.  via  failure  analysis. 


8.4  Product Strategies


Probably  the  single  most  important  consideration  involved  in  improving  the 
reliability  of  military  hardware  is  to  officially  recognize  that  there  is  a 
problem.  When  this  is  done  personnel  are  encouraged  to  catalog,  dissect,  and 
interpret  observed  failures  and  the  basis  for  understanding  the  causes  is 
established.  Improvements  can  grow  out  of  such  an  appreciation  of  the  situation. 
Naturally  the  most  constructive  way  that  this  sort  of  feedback  can  impact 
hardware  configurations  is  early  in  the  design  phase.  With  respect  to  weapons 
systems  at  least  the  Department  of  Defense  has  formally  adopted  this  posture  by 
issuing  Directive  5000.40.  This  document  (discussed  in  Ref.  41)  restructures 
procurement procedures as they relate to achieving reliability and maintainability
as well as performance objectives. The directive does two things: it recognizes
the problem and calls for solutions to be developed beginning with the earliest
engineering phases of a procurement program.

For its part the Navy has adopted a very progressive approach at top
management levels largely in the person of W. J. Willoughby, Jr., Deputy Chief
of Naval Material for reliability, maintainability, and quality assurance.
Willoughby's approach is detailed in a recent interview in Ref. 42. Basically
he feels the evidence now strongly supports the contention that engineering
discipline and manufacturing controls are better methods for achieving reliability
than is some form of proof testing. Willoughby's posture seems to be very
flexible. Contractors' ingenuity and creativity are allowed to blossom rather
than  being  rigidly  restrained. 

Thus  far  in  this  section  we  have  not  discussed  specific  upgrading  techniques 
such  as  surface  preparation,  burn  in,  or  process  temperature  control.  These 
specifics  grow  out  of  a  more  fundamental  commitment  to  success  by  the  people 
involved  with  a  given  project.  And  in  fact  the  techniques  just  named  relate  to 
two  distinctly  different  philosophies  of  achieving  the  desired  results.  Burn  in 
is  an  example  of  the  flaw  precipitation  approach.  Components  are  regarded  as 
vulnerable  to  the  inclusion  of  flaws — defects  which  deteriorate  into  failures 
with  repeated  application  of  stress.  Burn  in  is  designed  to  promote  these 
incipient  failures  and  preempt  inferior  units  from  seeing  actual  service. 

Surface  preparation,  process  temperature  control,  and  chemical  quality  control 
are  examples  of  steps  taken  during  manufacture  to  avoid  flaws  in  the  final 
product.  The  upgrading  strategy  is  to  make  a  superior  product  through  strict 
process controls rather than to select accidentally better units by a post-fabrication
sorting method. One hundred percent testing may still be desirable,
not to induce failures, but to deduce which units never worked to begin with.

Some  additional  observations  can  be  made  regarding  efforts  to  use  testing 
to  actually  improve  product  reliability  directly.  Recall  that  components  that 
exhibit  truly  exponential  reliability  are  not  degraded  by  use  until  their 
intrinsic  strength  limits  are  reached.  Thus  protracted  low  level  testing  is  of 
no  consequence.  Proof  testing  in  this  case  should  be  brief  involving  only 
loading  to  the  maximum  stress  levels  of  interest.  Good  units  are  not  damaged 
by  this;  inferior  specimens  are  destroyed.  This  procedure  doesn't  work  in 
wearout  situations.  Wearout  is  characterized  by  the  accumulation  of  damage 
under  extended  use.  Thus  protracted  testing  or  perhaps  a  judiciously  designed 
accelerated  test  is  required  to  demonstrate  wearout  reliability.  However, 
passing  such  a  test  leaves  hardware  heavily  aged  and  unfit  for  its  intended 
application.  Wearout  testing  serves  to  characterize  similar  equipment  rather 
than  qualify  the  particular  items  tested.  Essentially  the  reverse  is  true  for 
exponentially  reliable  components  although  some  insights  concerning  expectations 
for  similar  items  would  also  be  developed  in  this  way. 
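
The contrast between the two cases can be made concrete by asking how likely a unit is to survive a further service interval after it has already passed a test of a given length. The sketch below uses a Weibull model with shape 1 for the exponential case and shape 3 for wearout; the Weibull form and all numerical values are assumptions chosen for illustration.

```python
import math

def weibull_reliability(t, characteristic_life, shape):
    """R(t) = exp(-(t/eta)^beta); shape 1 is exponential, shape > 1 is wearout."""
    return math.exp(-((t / characteristic_life) ** shape))

def conditional_reliability(mission, age, characteristic_life, shape):
    """Probability of surviving `mission` more hours given survival to `age`."""
    return (weibull_reliability(age + mission, characteristic_life, shape)
            / weibull_reliability(age, characteristic_life, shape))

ETA = 50_000.0      # hypothetical characteristic life, hours
MISSION = 5_000.0   # hypothetical service interval after the test

for test_hours in (0.0, 20_000.0, 40_000.0):
    exp_case = conditional_reliability(MISSION, test_hours, ETA, shape=1.0)
    wear_case = conditional_reliability(MISSION, test_hours, ETA, shape=3.0)
    print(f"tested {test_hours:>7.0f} h: exponential {exp_case:.4f}, wearout {wear_case:.4f}")
```

In the exponential case the prior test time drops out entirely, while in the wearout case every test hour reduces the service reliability that remains.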

An  essential  recommendation  that  comes  from  all  of  this  is  that  the  proper 
role  of  testing  is  diagnostic.  Combined  with  failure  analysis  it  helps  one 
identify  what  areas  need  improvement.  The  improvement  should  be  accomplished 
by design change, material selection, altered processing, etc., not by more
testing. In the case of long-lived sonar equipment there may be cases where
improvements can be made in response to test information that seems to be rather
incomplete. That is, solutions to a problem can be developed more easily than
would be a full characterization of the reliability impact of the situation.

Another recommendation for upgrading hardware is to get the manufacturers
constructively involved. Don't force the response of contractors to be
adherence to some (perhaps obsolete) bureaucratic norm. Instead communicate
objectives and let the contractors' engineering staffs determine how to meet them.


8.5  Incentives

In  this  section  no  specific  answers  or  solutions  are  provided.  We  simply 
wish  to  focus  attention  on  a  continuing  need  if  reliability  benefits  are  to  be 
most  effectively  realized.  It  has  been  stated  that  front-end  investment  in 
reliability  produces  a  ten-fold  payback  in  maintenance  and  repair  expense 
avoided  later.  The  exact  figure  depends  on  the  specific  situation  but  is 
nevertheless  significant.  For  top-level  procurement  managers  this  in  itself 
ought  to  be  a  splendid  incentive.  But  how  are  middle  managers  and  junior 
operatives  rewarded  if  they  save  the  average  taxpayer  a  few  dollars?  And  what 
is  the  attitude  of  the  hardware  vendor?  By  building  better  equipment  does  he 
reduce  his  level  of  repeat  business?  If  so  this  is  a  negative  incentive.  If 
the  corporate  executive  even  suspects  (correctly  or  not)  that  a  better  product 
has  an  adverse  economic  impact  in  his  area  of  responsibility,  he  will  not  be 
expected  to  work  for  improved  reliability.  Thus  a  workable  benefit  situation 
needs to be defined at the level of every relevant profit center.

There  are  those  idealists  for  whom  the  opportunity  to  do  good  work  is  its 
own  reward  (The  author  likes  to  think  of  himself  as  such.).  But  for  the  most 
part  our  political  and  economic  systems  are  based  on  the  notion  that  services 
should  be  inspired  by  and  rewarded  with  some  sort  of  (hopefully  equitable)  wage 
or its equivalent. In the military equipment area we cannot afford to tolerate
unreliability and its debilitating side effects.^ But if the needed reliability
is  to  be  achieved,  the  economic  pie  must  be  sliced  in  such  a  way  that  personnel 
at  all  levels  on  both  the  consuming  and  producing  sides  recognize  that  the 
common  good  is  in  their  personal  best  interest.  As  Willoughby^  points  out  the 
bottom  line  is  quality,  ultimately  the  quality  of  the  people  committed  to  the 
success  of  the  venture.  The  average  worker  is  not  going  to  be  motivated 
simply  by  a  chance  to  cast  his  lot  on  one  side  or  another  of  a  nebulous 
ideological  struggle.  Other  ways  need  to  be  identified.  In  discussing  this 
informally  with  Navy  personnel,  the  author  has  found  some  reluctance  to  take 
the  idea  of  economic  incentives  seriously.  There  seems  to  be  precedent  for  this 
approach,  however.  I  have  been  told  the  Canadian  Air  Force  pays  some  kind  of 
premium  if  equipment  purchased  from  the  United  States  exceeds  specified 
reliability  objectives. ^  Developing  an  effective  incentives  posture  may  be 
the  single  most  significant  way  in  which  Navy  procurement  managers  can  impact 
the  reliability  problem. 


8.6  Management Needs

In this section we discuss attributes that would well serve individuals
charged with upgrading the reliability of hardware being procured. The reader
should begin to realize from this report alone that the quest for reliability
is complicated by the nature of the reliability problem. As a discipline
reliability is subtle, tricky, heavily mathematical, statistically based, and
enormously important. A procurement manager does not need to be a thoroughgoing
reliability specialist (reliability is not his only concern) but he
should be sufficiently accomplished to obtain competent help and avoid being
bamboozled by fast-talking associates or adversaries. In the author's opinion
there is a great deal of conventional wisdom being misapplied in the name of
reliability these days. Thus the procurement manager needs to be able to cut
to the heart of relevant issues and to be capable of forming independent
judgments. In addition it is necessary to define realistic objectives in connection
with any particular program. There are of course various levels of reliability
management ranging from overall policy determination to incentives development
and supervision of technical implementation tasks. At the higher levels, of
course, most operational details are left to others. Nonetheless top management
people need to be conscientious and well informed. Their decisions have
considerable impact.

Another factor relating to the kind of talent needed in reliability
management is the dynamic character of the field. Methods development and
refinement are continuing areas of activity. Failure analysis, probabilistic design,
accelerated testing, and Bayesian inference are all evolving areas. The need
for continuing education is apparent. Happily much is being done to meet this
need. Many fine textbooks and a great number of periodical publications treat
a wide variety of reliability topics (Access to an enormous literature is gained
by referring to the secondary sources cited by entries in the reference section
of this report.). This author is largely self-taught using such materials but
can also recommend a number of very beneficial institutes, seminars, and short
courses sponsored on a continuing basis by The University of Arizona, The
George Washington University, and the Reliability Analysis Center. The
latter is a division of The Illinois Institute of Technology housed at Griffiss
Air Force Base, Rome, New York. The continuing education offerings are timely,
incisive, and in some cases directed specifically to management issues.


9.0  CONCLUSIONS

This report has been prepared with the intention of providing an integrated
overview of the reliability field for technical and managerial personnel
concerned with upgrading wet-end sonar equipment. An attempt was made to present
the material in sufficient detail to permit the reader to digest and interpret
other work and to put into perspective problems in his own particular area of
interest. I cannot unilaterally say that this effort has been successful.
Such a determination awaits the collective judgment of users of this document.
For the author at least this study has served as a probe of the scope of the
reliability problem generally and the very large amount of work surrounding
it. Only a bit of scratching at the surface of this body of information has
been accomplished in these pages. Nevertheless the author feels that such a
step is necessary to stimulate the kind of dialogue that will lead to creative
solutions to reliability problems in the specialized sonar field.

Distinctions relating to testing such as whether tests are time terminated,
failure terminated, censored, or accelerated have not been sharply drawn.
Many specific reliability situations of interest have necessarily been
completely ignored in the report. However a variety of subject areas and source
materials are identified for those who want to pursue particular aspects in
greater detail. It is the author's impression that there is considerable
operational level misunderstanding about what can and can't and should and
shouldn't be done in meaningfully addressing reliability problems. The reader is
cautioned to guard against pitfalls of this nature and hopefully provided with
some of the tools needed to make critical judgments.

Quite fortunately a strong commitment to superior hardware reliability
has been made by the Navy at the top levels of management. This has taken
shape for weapons systems in the "New Look" philosophy emphasizing the
incorporation of reliability and maintainability efforts in the design phase of
hardware procurement. Encouraging preliminary results are becoming available
for some of the earliest programs handled in this way. Both the Navy and the
contractors involved are pleased with what seems to be significant progress
and the way they have worked together to achieve it.

I am not sure whether the attention given to sonar transducer reliability
in recent years is part of the official "New Look" or not (possibly a case of
my not seeing the forest for the trees). If it is, then a new look at the
"New Look" is recommended. It seems to me that the intended and proper thrust
of the "New Look" philosophy is progressive, flexible, and unconcerned with
the perpetuation of any conventional wisdom that has become counterproductive.
In this light perhaps the common efforts to construct exponential handbook
reliability models for wearout problems should be discarded as anachronistic.
It is not clear that contractors do not view these exercises simply as busy
work, part of the red tape associated with doing business with the Government.
The special scale, longevity, and accessibility situation for wet-end sonar
equipment suggests further that we concern ourselves more with actual
improvements rather than emphasizing the evaluation question per se.

Not as a conclusion but simply as a concluding remark the author would
like to invite and strongly encourage critical feedback from the reader
concerning the usefulness of this report in dealing with his particular reliability
concerns. It is often through interaction and interdisciplinary cross-fertilization
that collective problems are most effectively addressed.

Table I.  Important Reliability Functions and Relationships

Function   Name/Description
--------   ----------------------------------------------------------------
R(t)       Reliability: probability of system success
U(t)       Unreliability: cumulative failure distribution function
f(t)       Time-to-failure probability density function (p.d.f.)
λ(t)       Hazard rate: instantaneous failure rate
MTBF       Mean time between failures: expected life
R(t,T)     Conditional reliability for mission of duration T beginning
           at time t

Equation Nos.:  (1a, 1b), (2), (3a, 3b, 3c), (4a, 4b), (5a, 5b), (6), (7)


Table II.  Environment and Quality π-Factors for General-Purpose Diodes

ENVIRONMENT                                   πE
-----------------------------------------   ------
Ground, Benign (GB)                             1
Space, Flight (SF)                              1
Ground, Fixed (GF)                              5
Ground, Mobile (GM)                            10
Naval, Sheltered (NS)                          12
Naval, Unsheltered (NU)                        20
Airborne, Inhabited, Transport (AIT)           25
Airborne, Inhabited, Fighter (AIF)             25
Airborne, Uninhabited, Transport (AUT)         25
Airborne, Uninhabited, Fighter (AUF)           40
Missile, Launch (ML)                           40

QUALITY LEVEL                                 πQ
-----------------------------------------   ------
JANTXV                                        0.15
JANTX                                         0.3
JAN                                           1.5
Lower                                         7.5
Plastic                                      15.0


Table III.  Fifteen-Step Mechanical Reliability Prediction and Design-
for-Reliability Methodology (taken from Reference 8).


1.  Define the Design Problem and Determine the Mission Profile

2.  Determine the Design Variables and Parameters Involved

3.  Conduct a Failure Modes, Effects, and Criticality Analysis

4.  Determine the Dependence or Independence of the Component's
Failure Modes

5.  Determine the Failure Governing Criterion Involved in
Each Failure Mode

6.  Formulate the Failure Governing Stress Function

7.  Determine the Distribution of Each Design Stress Variable
and Factor for Each Failure Mode

8.  Determine the Failure Governing Stress Distribution for
Each Failure Mode

9.  Formulate the Failure Governing Strength Function

10.  Determine the Distribution of Each Design Strength Variable
and Factor for Each Failure Mode

11.  Determine the Failure Governing Strength Distribution for
Each Failure Mode

12.  Determine the Component's Reliability for Each Failure Mode

13.  Determine the Component's Reliability for All Failure Modes

14.  Determine the Overall Component Reliability Considering All
Failure Modes Involved

15.  Determine the Confidence Limit on the Calculated Component
Reliability
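
Steps 8, 11, and 12 of the table can be made concrete with a small numerical sketch. The Python fragment below is illustrative only and is not taken from Reference 8. It assumes that the failure governing stress and strength for each mode are independent and normally distributed, in which case the reliability for a mode is the probability that strength exceeds stress. It further assumes independent failure modes when combining them. All numerical values are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def interference_reliability(mu_strength, sd_strength, mu_stress, sd_stress):
    """P(strength > stress) for independent normal stress and strength."""
    margin_mean = mu_strength - mu_stress
    margin_sd = math.hypot(sd_strength, sd_stress)  # sd of (strength - stress)
    return normal_cdf(margin_mean / margin_sd)

def combined_reliability(mode_reliabilities):
    """Overall reliability if the failure modes are independent (assumed)."""
    return math.prod(mode_reliabilities)

# Two hypothetical failure modes, arbitrary consistent stress units.
r_mode_1 = interference_reliability(60.0, 4.0, 40.0, 3.0)   # margin of 4 sigma
r_mode_2 = interference_reliability(50.0, 6.0, 40.0, 8.0)   # margin of 1 sigma
print(r_mode_1, r_mode_2, combined_reliability([r_mode_1, r_mode_2]))
```

The second mode dominates the result, which shows why the methodology asks for the stress and strength distributions of every failure mode rather than a single safety factor.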

Table IV.  Corrosion Model Computational Parameters and Functions

LABEL          DEFINITION                                   NAME
-------------  -------------------------------------------  ------------------------------------------
Input data:
a              $(\sigma_{r_0}/\mu_{r_0})^2$                 squared coefficient of variation, radius
b              $(\sigma_c/\mu_c)^2$                         squared coefficient of variation, corrosion rate
z              $(\mu_c/\mu_{r_0})\,t$                       standardized process rate/time index
Δz             $(\mu_c/\mu_{r_0})\,\Delta t$                rate/time increment
S'                                                          fractional strength endpoint

Computed quantities:
A              $1 - z$
B              $bz$
C              $a + bz^2$
$\mu_{S'}$     $A^2 + C$                                    (1+a) times mean fractional strength
$\sigma_{S'}$  $[2(\mu_{S'}^2 - A^4)]^{1/2}$                (1+a) times fractional strength standard deviation
$d\mu_{S'}/dz$     $-2(A - B)$
$d\sigma_{S'}/dz$  $(4/\sigma_{S'})\,[\mu_{S'}B - AC]$
(label illegible)  $[(1+a)S' - \mu_{S'}]/\sigma_{S'}$
f(z)           (expression not recoverable from the scan)   standardized rate/time to endpoint density function


Table V.  Ordered Time-to-Failure Data and Median Ranks
(Sample Size = 20 Units)

FAILURE      TIME TO FAILURE    MEDIAN RANK
NUMBER j     t_j (hours)        (MR)_j (percent)
--------     ---------------    ----------------
    1             1,943               3.41
    2             3,376               8.25
    3             4,180              13.15
    4             4,311              18.06
    5             5,124              22.97
    6             5,976              27.88
    7             6,416              32.80
    8             7,250              37.71
    9             7,761              42.63
   10             8,245              47.54
   11             8,528              52.46
   12             9,226              57.37
   13            10,447              62.29
   14            10,508              67.20
   15            11,028              72.12
   16            11,462              77.03
   17            12,803              81.94
   18            12,998              86.85
   19            13,026              91.75
   20            16,042              96.59
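
The median rank column can be regenerated numerically. The table itself does not say how its ranks were obtained, so the Python sketch below rests on an assumption: it takes the median rank of the j-th ordered failure in a sample of n to be the probability p at which the chance of seeing at least j failures among n units equals one half, and solves for p by bisection. Benard's approximation, (j - 0.3)/(n + 0.4), is included for comparison. Function names are illustrative.

```python
import math

def binomial_tail(p, j, n):
    """P(at least j of n units have failed) when each fails with probability p."""
    return sum(math.comb(n, k) * p**k * (1.0 - p)**(n - k)
               for k in range(j, n + 1))

def median_rank(j, n, tol=1e-12):
    """Median rank of the j-th ordered failure: solve tail(p) = 0.5 by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if binomial_tail(mid, j, n) < 0.5:
            lo = mid          # tail too small, so p must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

def benard_rank(j, n):
    """Benard's closed-form approximation to the median rank."""
    return (j - 0.3) / (n + 0.4)

for j in (1, 10, 20):
    print(j, round(100 * median_rank(j, 20), 2), round(100 * benard_rank(j, 20), 2))
```

For the first and last failures of a 20-unit sample this procedure gives 3.41 and 96.59 percent, matching the corresponding table entries.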

Table VI.  Application of the χ² Goodness-of-Fit Test to the Data of Table V.


0.099 < 3.84  →  Hypothesis not rejected


Table VII.  Application of the Kolmogorov-Smirnov Goodness-of-Fit
Test to the Data of Table V.

                RANK FRACTION OF    THEORETICAL CUMULATIVE
TIME TO         OBSERVED DATA       DISTRIBUTION F(x_r)       ABSOLUTE DIFFERENCE
FAILURE         S_n(x_r) = r/n      β = 2.3, η = 10^4 hrs     |S_n(x_r) - F(x_r)|
(hours)
--------        ----------------    ----------------------    -------------------
  1,943              0.05                  0.023                    0.027
  3,376              0.10                  0.079                    0.021
  4,180              0.15                  0.126                    0.024
  4,311              0.20                  0.134                    0.066
  5,124              0.25                  0.193                    0.057
  5,976              0.30                  0.264                    0.036
  6,416              0.35                  0.303                    0.047
  7,250              0.40                  0.380                    0.020
  7,761              0.45                  0.428                    0.022
  8,245              0.50                  0.474                    0.026
  8,528              0.55                  0.500                    0.050
  9,226              0.60                  0.564                    0.036
 10,447              0.65                  0.669                    0.019
 10,508              0.70                  0.674                    0.026
 11,028              0.75                  0.714                    0.036
 11,462              0.80                  0.746                    0.054
 12,803              0.85                  0.829                    0.021
 12,998              0.90                  0.839                    0.061
 13,026              0.95                  0.841                    0.109
 16,042              1.00                  0.948                    0.052

n = 20

α = 0.05

d_α(n) = 0.294 (critical value from published tables)

d = max |S_n(x_r) - F(x_r)| = 0.109 < 0.294  →  Hypothesis not rejected
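
The arithmetic of Table VII can be checked with a short script. The Python sketch below assumes that the theoretical distribution is the two-parameter Weibull cumulative function F(t) = 1 - exp[-(t/η)^β], which reproduces the tabulated F values for β = 2.3 and η = 10^4 hours. It also follows the table's convention of comparing F only with the rank fraction r/n. Variable and function names are illustrative.

```python
import math

TIMES_HOURS = [1943, 3376, 4180, 4311, 5124, 5976, 6416, 7250, 7761, 8245,
               8528, 9226, 10447, 10508, 11028, 11462, 12803, 12998, 13026, 16042]
BETA = 2.3          # Weibull shape parameter used in Table VII
ETA = 1.0e4         # Weibull scale parameter, hours
D_CRITICAL = 0.294  # critical value quoted in the table for n = 20, alpha = 0.05

def weibull_cdf(t, beta, eta):
    """Two-parameter Weibull cumulative distribution (assumed form)."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def ks_statistic(times, beta, eta):
    """Largest |r/n - F(t_r)| over the ordered failures, and where it occurs."""
    n = len(times)
    worst_diff, worst_time = 0.0, None
    for r, t in enumerate(sorted(times), start=1):
        diff = abs(r / n - weibull_cdf(t, beta, eta))
        if diff > worst_diff:
            worst_diff, worst_time = diff, t
    return worst_diff, worst_time

d, t_at_max = ks_statistic(TIMES_HOURS, BETA, ETA)
print(f"d = {d:.3f} at t = {t_at_max} h; rejected: {d >= D_CRITICAL}")
```

The largest difference occurs at the 13,026 hour failure, as in the table, and falls well below the critical value.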

Figure 2.  Random Hazard Reliability Functions
    a.  Exponential Time-to-Failure PDF
    b.  Exponential Reliability
    c.  Constant Hazard Rate

Figure 3.  Wearout Reliability Functions (Normal PDF),
    $f(t) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left[-\frac{1}{2}\left(\frac{t-\mu}{\sigma}\right)^2\right]$
    c.  Increasing Hazard Rate

Figure 4.  Wearout Reliability Functions (Log Normal PDF)

Figure 5.  Early Failure Reliability Functions, $f(t) = 0.5\,t^{-0.5}\,e^{-t^{0.5}}$
    c.  Decreasing Hazard Rate

Figure 7.  Probability Distribution of MTBF Estimators and
Its Cumulative (Exponential Parent Population)

Figure 11.  Minimum Acceptable Estimated (Demonstrated) MTBF
in Mission Duration Units as a Function of Number of
Observed Failures to Demonstrate 90% Exponential
Reliability at Several Confidence Levels

Illustrative Example of Overlapping Stress
and Strength Distributions

S = Strength;  U(S',t) = Time-Dependent Unreliability
(Fraction Worn Out)

Figure 21.  Rank Distributions for Sample Size 10

Figure 25.  Population Reliability of Ten Units:
Deterministic and Distributed Failure Rates Compared


APPENDIX A


Reliability as a Function of Hazard Rate


Let the hazard rate function (probability of failure per unit time) be
represented as λ(t). Consider N identical units having hazard rate λ(t) to be
operational at time t. The expected number of failures occurring in an
infinitesimal interval dt at t is Nλ(t)dt, which results in a change -dN in the
number of unfailed units remaining. Thus

$$-dN = N\,\lambda(t)\,dt. \tag{A1}$$

The variables separate yielding

$$\frac{dN}{N} = -\lambda(t)\,dt. \tag{A2}$$

Integration yields

$$\ln N = -\int_0^t \lambda(t')\,dt' + \ln N_0, \tag{A3}$$

where $N_0$ is the number of functioning units at t = 0.
Exponentiation of Eq. (A3) yields

$$\frac{N}{N_0} = \exp\left[-\int_0^t \lambda(t')\,dt'\right]. \tag{A4}$$

But $N/N_0$ is the fraction of the initial population that is unfailed, which
equals the probability that a single device is unfailed. We identify this
with single device reliability R(t). Thus

$$R(t) = \exp\left[-\int_0^t \lambda(t')\,dt'\right]. \tag{A5}$$
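
Equation (A5) is easy to exercise numerically. The Python sketch below is an illustration rather than part of the original appendix. It integrates a supplied hazard rate function with the trapezoidal rule and exponentiates the result. The two hazard functions and their parameter values are hypothetical: a constant hazard (the exponential case) and a linearly increasing hazard (a Weibull wearout case with shape parameter 2).

```python
import math

def reliability_from_hazard(hazard, t, steps=10_000):
    """Eq. (A5): R(t) = exp(-integral of hazard from 0 to t), trapezoidal rule."""
    if t <= 0.0:
        return 1.0
    h = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, steps):
        total += hazard(i * h)
    return math.exp(-total * h)

LAMBDA = 1.0e-4   # hypothetical constant hazard rate, failures per hour
ETA = 1.0e4       # hypothetical Weibull scale parameter, hours

def constant_hazard(t):
    return LAMBDA

def linear_hazard(t):
    # Weibull hazard with shape 2: (2/eta) * (t/eta)
    return 2.0 * t / ETA**2

r_constant = reliability_from_hazard(constant_hazard, 5_000.0)
r_wearout = reliability_from_hazard(linear_hazard, ETA)
print(r_constant, r_wearout)
```

With these inputs the first result agrees with exp(-λt) = exp(-0.5) and the second with exp[-(t/η)²] = exp(-1), the closed forms that follow from Eq. (A5).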

APPENDIX B


Transformation of Distributed Random Variables

The reader is referred to standard sources such as Ref. 5 for additional
information.

Suppose one has specified the probability density function f(x) of a
continuous random variable x. If a change of variable

$$y = y(x) \tag{B1}$$

is introduced, then the probability density function of the new variable y is

$$g(y) = f[x(y)]\,\left|\frac{dx}{dy}\right|. \tag{B2}$$

Here |dx/dy| is the absolute value of the derivative of the inverse
transformation x(y); it is often conveniently evaluated as $|dy/dx|^{-1}$.
Equation (B2) holds where y(x) is strictly monotonic. If this is not the case,
then the problem must be decomposed into regions where y(x) is strictly
increasing or decreasing and Eq. (B2) applied to each of them separately.
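
A brief numerical check of Eq. (B2) may be helpful. The Python sketch below is illustrative and its choice of transformation is an assumption, not an example from the report. It takes x to be a standard normal variable and sets y = exp(x), which is strictly increasing, so that x(y) = ln y and dx/dy = 1/y. The resulting g(y) is the log normal density. Integrating g(y) from 0 to e should return the probability that x is less than 1.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian probability density function."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def transformed_pdf(f, x_of_y, dx_dy):
    """Eq. (B2): g(y) = f(x(y)) * |dx/dy| for a strictly monotonic y(x)."""
    return lambda y: f(x_of_y(y)) * abs(dx_dy(y))

# y = exp(x), hence x(y) = ln(y) and dx/dy = 1/y.
lognormal_pdf = transformed_pdf(normal_pdf, math.log, lambda y: 1.0 / y)

def integrate(g, a, b, steps=200_000):
    """Midpoint rule; it never evaluates g at the end points."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

prob_y_below_e = integrate(lognormal_pdf, 0.0, math.e)
prob_x_below_1 = 0.5 * (1.0 + math.erf(1.0 / math.sqrt(2.0)))
print(prob_y_below_e, prob_x_below_1)
```

The two printed probabilities agree closely, as they must, because the events y < e and x < 1 are the same event expressed in different variables.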

APPENDIX C


Evolution of Wearout Profile When Failures Are Replaced as They Occur

Imagine that we are dealing with a group of units that exhibit pure wearout
behavior with a time-to-failure probability density function $f_1(t)$.


Figure A1.  Graphical Relationship of 1st and 2nd Generation
Wearout Profiles


First generation units that wear out and are replaced in the interval dt' at
t' contribute a distribution $df_2(t,t')$ to the second generation of times to
failure where

$$df_2(t,t') = \left[\frac{f_1(t'-\mu_1)\,dt'}{\int f_1(t'-\mu_1)\,dt'}\right] f_1(t-t'-\mu_1). \tag{C1}$$

Or since $f_1$ is normalized

$$df_2(t,t') = f_1(t-t'-\mu_1)\,f_1(t'-\mu_1)\,dt'. \tag{C2}$$

Introducing the variable changes $\alpha = t - 2\mu_1$ and $\beta = t' - \mu_1$ and integrating
over t' yields the 2nd generation time-to-failure profile

$$f_2(t) = \int_{-\infty}^{\infty} f_1(\alpha-\beta)\,f_1(\beta)\,d\beta. \tag{C3}$$

Equation (C3) is a standard convolution integral. If we take $f_1(t)$ to be a
Gauss-normal function, that is

$$f_1(\beta) = \frac{1}{\sqrt{2\pi}\,\sigma_1}\exp\left[-\frac{1}{2}\left(\frac{\beta}{\sigma_1}\right)^2\right], \tag{C4a}$$

$$f_1(\alpha-\beta) = \frac{1}{\sqrt{2\pi}\,\sigma_1}\exp\left[-\frac{1}{2}\left(\frac{\alpha-\beta}{\sigma_1}\right)^2\right], \tag{C4b}$$

evaluation of Eq. (C3) yields

$$f_2(t) = \frac{1}{\sqrt{2\pi}\,(\sqrt{2}\,\sigma_1)}\exp\left[-\frac{1}{2}\left(\frac{t-2\mu_1}{\sqrt{2}\,\sigma_1}\right)^2\right]. \tag{C5}$$

Equation (C5) is itself Gaussian centered at $2\mu_1$ with a standard deviation
$\sqrt{2}\,\sigma_1$. This procedure is readily generalized. One finds that the time-
to-failure distribution for the nth generation of wearout failures is Gaussian
with parameters

$$\mu_n = n\,\mu_1 \tag{C6a}$$

and

$$\sigma_n = \sqrt{n}\,\sigma_1. \tag{C6b}$$

For n = 0 this description correctly accommodates the simultaneous placement
of the original units in service at t = 0. (The Gaussian with parameters
$\mu_0 = \sigma_0 = 0$ is a Dirac delta function.) The time-to-failure distributions broaden
with increasing generation index number n. Soon overlap effects become
dominant and the overall system failure rate obtains via contributions from many
generations superimposed.
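
The superposition described in the closing sentences can be sketched numerically. The Python fragment below is illustrative only. It sums the Gaussian generation profiles of Eqs. (C6a) and (C6b) for n = 1, 2, ... to obtain the replacement rate per installed unit. The values of μ₁ and σ₁ are hypothetical, and the cutoff on the number of generations is a computational convenience rather than part of the model.

```python
import math

def gaussian_pdf(t, mu, sigma):
    """Gaussian probability density function."""
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def replacement_rate(t, mu1, sigma1, generations=200):
    """Sum of the generation densities with mu_n = n*mu1, sigma_n = sqrt(n)*sigma1."""
    return sum(gaussian_pdf(t, n * mu1, math.sqrt(n) * sigma1)
               for n in range(1, generations + 1))

MU1 = 10.0     # hypothetical mean wearout life (e.g. years)
SIGMA1 = 2.0   # hypothetical standard deviation of wearout life

for t in (5.0, 10.0, 15.0, 20.0, 50.0, 100.0):
    print(f"t = {t:6.1f}: replacements per unit per unit time = "
          f"{replacement_rate(t, MU1, SIGMA1):.4f}")
```

Under these assumptions the rate shows a sharp first peak near t = μ₁ and broader, lower peaks afterward. At long times the peaks merge and the rate settles near 1/μ₁, a nearly constant value of the kind usually associated with random failures even though every unit fails by wearout.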


APPENDIX D


Binary Synthesis of the Corrosion Model Failure Governing Strength Distribution

Provided the coefficients of variation of the quantities involved are not
too large, functions of normally distributed random variables are themselves
at least approximately normally distributed. The parameters of the composite
distribution may be inferred from the means and standard deviations of the basis
variables grouped a pair at a time. This approach is referred to as binary
synthesis of normal distributions. Kececioglu discusses the method in Ref. 8 and
catalogs the appropriate relationships for a number of elementary operations.

We are interested in evaluating the distribution of residual strengths
$S = \pi T (r_0 - ct)^2$ for the corrosion wearout model discussed in Section 5.4 of the
body of the report. Expressed in the format $x = x[\mu_x, \sigma_x]$ where $\mu_x$ and $\sigma_x$ are
the mean and standard deviation of the Gaussian function representing the
distribution of the quantity x, our modeling assumptions were:

$$r_0 = r_0[\mu_{r_0}, \sigma_{r_0}] \tag{D1a}$$

$$c = c[\mu_c, \sigma_c] \tag{D1b}$$

$$T = T[T, 0] \tag{D1c}$$

$$t = t[t, 0]. \tag{D1d}$$

With this input information we can apply the rules for multiplying a distributed
variable by a constant, subtraction of variables, and squaring to obtain the
desired results. Thus using the notation f(x) to denote the distribution of x,
building the parameters of f(S) proceeds by binary synthesis as follows:

f(ct):
$$\mu_{ct} = \mu_c t, \qquad \sigma_{ct} = \sigma_c t$$

f(r_0 - ct):
$$\mu_{(r_0-ct)} = \mu_{r_0} - \mu_{ct}, \qquad \sigma_{(r_0-ct)} = (\sigma_{r_0}^2 + \sigma_{ct}^2)^{1/2}$$

f((r_0 - ct)^2):
$$\mu_{(r_0-ct)^2} = (\mu_{r_0} - \mu_{ct})^2 + \sigma_{r_0}^2 + \sigma_{ct}^2$$
$$\sigma_{(r_0-ct)^2} = \left[4(\mu_{r_0}-\mu_{ct})^2(\sigma_{r_0}^2+\sigma_{ct}^2) + 2(\sigma_{r_0}^2+\sigma_{ct}^2)^2\right]^{1/2}$$

f(S):
$$\mu_S = \pi T\left[(\mu_{r_0} - \mu_{ct})^2 + \sigma_{r_0}^2 + \sigma_{ct}^2\right]$$
$$\sigma_S = \pi T\left[4(\mu_{r_0}-\mu_{ct})^2(\sigma_{r_0}^2+\sigma_{ct}^2) + 2(\sigma_{r_0}^2+\sigma_{ct}^2)^2\right]^{1/2}.$$

The quantities $\mu_S = \mu_S(t)$ and $\sigma_S = \sigma_S(t)$ are the time dependent strength
distribution parameters required for use in Section 5.4.
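
The binary synthesis steps translate directly into a few lines of code. The Python sketch below is an illustration under the stated modeling assumptions (normally distributed r₀ and c, deterministic T and t). The function name and all numerical inputs are hypothetical.

```python
import math

def strength_parameters(t, T, mu_r0, sd_r0, mu_c, sd_c):
    """Mean and standard deviation of S = pi*T*(r0 - c*t)**2 by binary synthesis."""
    # f(ct): a distributed variable multiplied by the constant t
    mu_ct = mu_c * t
    sd_ct = sd_c * t
    # f(r0 - ct): subtraction of independent normal variables
    mu_diff = mu_r0 - mu_ct
    var_diff = sd_r0**2 + sd_ct**2
    # f((r0 - ct)^2): squaring a normal variable
    mu_sq = mu_diff**2 + var_diff
    sd_sq = math.sqrt(4.0 * mu_diff**2 * var_diff + 2.0 * var_diff**2)
    # f(S): multiplication by the constant pi*T
    return math.pi * T * mu_sq, math.pi * T * sd_sq

# Hypothetical inputs: radius 1.0 +/- 0.1, corrosion rate 0.01 +/- 0.002 per
# unit time, strength constant T = 2.
for t in (0.0, 10.0, 20.0, 40.0):
    mu_s, sd_s = strength_parameters(t, 2.0, 1.0, 0.1, 0.01, 0.002)
    print(f"t = {t:5.1f}: mean strength = {mu_s:.4f}, std dev = {sd_s:.4f}")
```

When both standard deviations are set to zero the routine returns the deterministic strength πT(r₀ - ct)² with zero spread, which is a convenient check on an implementation.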

APPENDIX E


C     LEAST SQUARES FIT -- WEIBULL CUMULATIVE PDF
C
C     DESCRIPTION OF PARAMETERS:
C
C     IRUN   - DATA SET CATALOG NUMBER
C     NPOINT - NUMBER OF XY DATA PAIRS
C     X      - ARRAY OF VALUES OF INDEPENDENT VARIABLE (TIME, CYCLES, ETC.)
C     Y      - MEDIAN RANKS (ETC.) CORRESPONDING TO THE X'S
C     NPAR   - NUMBER OF FITTING PARAMETERS
C     BETA   - WEIBULL SHAPE PARAMETER
C     ETA    - WEIBULL SCALE PARAMETER
C     GAMMA  - WEIBULL LOCATION PARAMETER
C
C     SUBROUTINE REQUIRED: MATINV -- SEE REF. 17 PAGE 302
C
C     DIMENSION AND DP STATEMENTS VALID UP TO NPAR = 6, NPOINT = 100
C
      DIMENSION X(100), Y(100), YTHEOR(100), C(6), DY(6), S(6)
      DOUBLE PRECISION DELSQR, ASCALE(6,6), A(6,6), B(6)
    3 FORMAT(2F10.4)
    4 FORMAT(1H1, 2I5)
    5 FORMAT(1H , 1P6E14.6)
    6 READ(6,4) IRUN, NPOINT
      WRITE(3,4) IRUN, NPOINT
      WRITE(3,5)
      READ(6,3) (X(I), Y(I), I = 1, NPOINT)
      NPAR = 3
C
C     C(I) ARE INITIAL PARAMETER ESTIMATES
C
      READ(6,3) (C(I), I = 1, NPAR)
      DELSAV = 1.E35
C
C     START OF ITERATION -- CLEAR GRADIENT VECTOR AND CURVATURE MATRIX
C
   13 DO 16 I = 1, NPAR
      B(I) = 0.
      DO 16 J = 1, NPAR
   16 A(I,J) = 0.
      DELSQR = 0.
      BETA = C(1)
      ETA = C(2)
      GAMMA = C(3)
      BOE = BETA/ETA
C
C     ACCUMULATE SUM OF SQUARED RESIDUALS AND NORMAL EQUATIONS
C
      DO 35 N = 1, NPOINT
      Z = (X(N)-GAMMA)/ETA
      ZEB = Z**BETA
      EXZEB = EXP(-ZEB)
      YTHEOR(N) = 1. - EXZEB
C
C     DY(I) = D(YTHEOR)/D(C(I)) ARE PARTIAL DERIVATIVES
C
      DY(3) = -BOE*ZEB*EXZEB/Z
      DY(2) = Z*DY(3)
      DY(1) = -ALOG(Z)*DY(2)/BOE
      DEL = Y(N) - YTHEOR(N)
      DELSQR = DELSQR + DEL*DEL
      DO 35 I = 1, NPAR
      B(I) = B(I) + DY(I)*DEL
      DO 35 J = 1, NPAR
   35 A(I,J) = A(I,J) + DY(I)*DY(J)
      WRITE(3,5) (C(I), I = 1, NPAR), DELSQR
C
C     CONVERGED IF DELSQR CHANGED BY LESS THAN 1 PERCENT.
C     STOP IF DELSQR INCREASED.
C
      IF (DABS(DELSAV-DELSQR) .LT. 0.01*DELSQR) GO TO 48
      IF (DELSAV-DELSQR) 54, 39, 39
   39 DELSAV = DELSQR
C
C     RESCALE CURVATURE MATRIX (DIAGONAL ELEMENTS = 1)
C
      DO 42 I = 1, NPAR
      DO 42 J = 1, NPAR
   42 ASCALE(I,J) = A(I,J)/DSQRT(A(I,I)*A(J,J))
C
C     INVERT MATRIX AND CALCULATE NEW PARAMETERS
C
      CALL MATINV(ASCALE, NPAR, DET)
      DO 46 I = 1, NPAR
      DO 46 J = 1, NPAR
   46 C(I) = C(I) + ASCALE(I,J)*B(J)/DSQRT(A(I,I)*A(J,J))
      GO TO 13
C
C     COMPUTE PARAMETER UNCERTAINTIES
C
   48 RMS = DSQRT(DELSQR/FLOAT(NPOINT-NPAR))
      DO 50 I = 1, NPAR
   50 S(I) = RMS*DSQRT(ASCALE(I,I)/A(I,I))
      WRITE(3,5)
      WRITE(3,5) (S(I), I = 1, NPAR), RMS
      GO TO 6
   54 STOP
      END
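
For readers without access to MATINV, a much simpler estimate of the Weibull parameters can be obtained by linear regression. The Python sketch below is not a translation of the program above. It swaps in a different technique, assuming a two-parameter Weibull distribution (location parameter GAMMA = 0), under which ln[-ln(1 - F)] is a straight line in ln t with slope β. It fits that line by ordinary least squares to the Table V data. All names are illustrative.

```python
import math

TIMES_HOURS = [1943, 3376, 4180, 4311, 5124, 5976, 6416, 7250, 7761, 8245,
               8528, 9226, 10447, 10508, 11028, 11462, 12803, 12998, 13026, 16042]
MEDIAN_RANKS_PERCENT = [3.41, 8.25, 13.15, 18.06, 22.97, 27.88, 32.80, 37.71,
                        42.63, 47.54, 52.46, 57.37, 62.29, 67.20, 72.12, 77.03,
                        81.94, 86.85, 91.75, 96.59]

def weibull_rank_regression(times, ranks_percent):
    """Fit ln(-ln(1 - F)) = beta*ln(t) - beta*ln(eta) by least squares."""
    xs = [math.log(t) for t in times]
    ys = [math.log(-math.log(1.0 - mr / 100.0)) for mr in ranks_percent]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta = sxy / sxx                    # slope is the shape parameter
    intercept = y_mean - beta * x_mean  # equals -beta * ln(eta)
    eta = math.exp(-intercept / beta)
    return beta, eta

beta_hat, eta_hat = weibull_rank_regression(TIMES_HOURS, MEDIAN_RANKS_PERCENT)
print(f"beta = {beta_hat:.2f}, eta = {eta_hat:.0f} hours")
```

The estimates come out in the neighborhood of the β = 2.3 and η = 10^4 hours used in Table VII. They need not match a three-parameter nonlinear fit exactly, because the linearized regression weights the data differently and fixes the location parameter at zero.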